Abstract - We present a theory of quantum serial turbo-codes,
describe their iterative decoding algorithm, and study
their performance numerically on a depolarization channel.
Our construction offers several advantages over quantum LDPC 
codes. First, the Tanner graph used for decoding is free of 4- 
cycles that deteriorate the performances of iterative decoding. 
Secondly, the iterative decoder makes explicit use of the code's
degeneracy. Finally, there is complete freedom in the code design 
in terms of length, rate, memory size, and interleaver choice. 

We define a quantum analogue of a state diagram that 
provides an efficient way to verify the properties of a quantum 
convolutional code, and in particular its recursiveness and the 
presence of catastrophic error propagation. We prove that all 
recursive quantum convolutional encoders have catastrophic error
propagation. In our constructions, the convolutional codes have 
thus been chosen to be non-catastrophic and non-recursive. While 
the resulting families of turbo-codes have bounded minimum 
distance, from a pragmatic point of view the effective minimum 
distances of the codes that we have simulated are large enough not 
to degrade the iterative decoding performance up to reasonable 
word error rates and block sizes. With well chosen constituent
convolutional codes, we observe an important reduction of the 
word error rate as the code length increases. 

I. Introduction 

For the fifty years that followed Shannon's landmark paper
[39] on information theory, the primary goal of the field
of coding theory was the design of practical coding schemes 
that could come arbitrarily close to the channel capacity. 
Random codes were used by Shannon to prove the existence 
of codes approaching the capacity - in fact he proved that 
the overwhelming majority of codes are good in this sense. 
For symmetric channels this can even be achieved by linear 
codes. Unfortunately, decoding a general linear code is an NP-hard
problem [5], so random linear codes have no practical relevance. Making the
decoding problem tractable thus requires the use of codes with 
even more structure. 

The first few decades were dominated by algebraic coding 
theory. Codes such as Reed-Solomon codes [38] and Bose- 
Chaudhuri-Hocquenghem codes [21], [7] use the algebraic 
structure of finite fields to design codes with large minimal 
distances that have efficient minimal distance decoders. The 
most satisfying compromise nowadays is instead obtained 
from families of codes (sometimes referred to as "probabilistic
codes") with some element of randomness but sufficiently 
structured to be suitable for iterative decoding. They display 
good performances for a large class of error models with a 
decoding algorithm of reasonable complexity. The most promi- 
nent families of probabilistic codes are Gallager's low density 
parity-check (LDPC) codes [16] and turbo-codes [6]. They are
all decoded by a belief propagation algorithm which, albeit
sub-optimal, has been shown to have astonishing performance 
even at rates very close to the channel capacity. Moreover, 
the randomness involved in the code design can facilitate the 
analysis of their average performance. Indeed, probabilistic
codes are in many respects related to quench-disordered physical
systems, so standard statistical physics tools can be called into 
play [46], [29]. 

Quantum information and quantum error correction [41], 
[44], [4], [17], [23] are much younger theories and differ from 
their classical cousins in many aspects. For instance, there 
exists a quantum analogue of the Shannon channel capacity 
called the quantum channel capacity [12], [40], [26], which 
sets the maximum rate at which quantum information can be 
sent over a noisy quantum channel. Contrary to the classical
case, we do not know how to efficiently compute its value for
channels of practical significance, except for quite peculiar
channels such as the quantum erasure channel where it is
equal to one minus twice the erasure probability [3]. For
the depolarizing channel - the quantum generalization of the 
binary symmetric channel - random codes do not achieve the 
optimal transmission rate in general. Instead, they provide a 
lower bound on the channel capacity, often referred to as the 
hashing bound. In fact, coding schemes have been designed to 
reliably transmit information on a depolarization channel in a
noise regime where the hashing bound is zero [14], [42]. 

The stabilizer formalism [17] is a powerful method in 
which a quantum code on n qubits can be seen as a classical
linear code on 2n bits, but with a parity-check matrix whose
rows are orthogonal relative to a symplectic inner product.
Moreover, a special class of stabilizer codes, called CSS codes
after their inventors [8], [43], can turn any pair of dual classical
linear codes into a quantum code with related properties. The
stabilizer formalism and the CSS construction allow one to import
a great deal of knowledge directly from the classical theory, 
and one may hope to use them to leverage the power of 
probabilistic coding to the quantum domain. In particular, one 
may expect that, as in the classical case, quantum analogues 
of LDPC codes or turbo-codes could perform under iterative 
decoding as well as random quantum codes, i.e. that they could 
come arbitrarily close to the hashing bound. 

For this purpose, it is also necessary to design a good 
iterative decoding algorithm for quantum codes. For a special 
class of noise models considered here - namely Pauli noise 
models - it turns out that a version of the classical belief 
propagation algorithm can be applied. For CSS codes in
particular, each code in the pair of dual codes can be decoded
independently as a classical code. However, this is done at the 
cost of neglecting some correlations between errors that impact 
the coding scheme's performance. For some classes of stabilizer
codes, the classical belief propagation can be improved to
exploit the coset structure of degenerate errors, which improves
the code's performance. This is the case for concatenated
block codes [35] and the turbo-codes we consider here, but 
we do not know how to exploit this feature for LDPC codes 
for instance. Finally, a quantum belief propagation algorithm 
was recently proposed [25] to enable iterative decoding of 
more general (non-Pauli) noise models. As in the classical 
case, quantum belief propagation also ties in with statistical 
physics [20], [24], [25], [36]. 

We emphasize that a fast decoding algorithm is crucial in 
quantum information theory. In the classical setting, when 
error correction codes are used for communication over a 
noisy channel, the decoding time translates directly into communication
delays. This has been the driving motivation to
devise fast decoding schemes, and is likely to be important in
the quantum setting as well. However, there is an important 
additional motivation for efficient decoding in the quantum 
setting. Quantum computation is likely to require active sta- 
bilization. The decoding time thus translates into computation 
delays, and most importantly into error suppression delays. If
errors accumulate faster than they can be identified, quantum 
computation may well become infeasible: fast decoding is an 
essential ingredient to fault-tolerant computation (see however 
[13]). 

The first attempts at obtaining quantum analogues of LDPC 
codes [28], [9], [19] have not yielded results as spectacular 
as their classical counterparts. This is due to several reasons.
First there are issues with the code design. Due to the 
orthogonality constraints imposed on the parity-check matrix, 
it is much harder to construct quantum LDPC codes than 
classical ones. In particular, constructing the code at random 
will certainly not do. The CSS construction is of no help 
since random sparse classical codes do not have sparse duals. 
In fact, it is still unknown whether there exist families of 
quantum LDPC codes with non-vanishing rate and unbounded 
minimum distance. Moreover, all known constructions seem
to suffer from poor minimum distances for reasons which
are not always fully understood. Second, there are issues 
with the decoder. The Tanner graph associated to a quantum 
LDPC code necessarily contains many 4-cycles which are 
well known for their negative effect on the performances of 
iterative decoding. Moreover, quantum LDPC codes are by 
definition highly degenerate but their decoder does not exploit 
this property: rather it is impaired by it [37]. 

On the other hand, generalizing turbo-codes to the quantum 
setting first requires a quantum analogue of convolutional 
codes. These have been introduced in [10], [11], [31], [32] 
and followed by further investigations [15], [18], [1]. Quan- 
tum turbo-codes can be obtained from the interleaved serial 
concatenation of convolutional codes. This idea was first 
introduced in [33]. There, it was shown that, on memoryless
Pauli channels, quantum turbo-codes can be decoded similarly
to classical serial turbo-codes. One of the motivations behind
this work was to overcome some of the problems faced by
quantum LDPC codes. For instance, graphical representations
of serial quantum turbo-codes do not necessarily contain 4-
cycles. Moreover, there is complete freedom in the code pa- 
rameters. Both of these points are related to the fact that there 
are basically no restrictions on the choice of the interleaver 
used in the concatenation. Another advantage over LDPC
codes is that the decoder makes explicit use of the coset 
structure associated to degenerate errors. 

Despite these features, the iterative decoding performance of 
the turbo-code considered in [33] was quite poor, much poorer 
in fact than the results obtained from quantum LDPC codes. The
purpose of the present article is to discuss at length several
issues omitted in [33], to provide a detailed description of the 
decoding algorithm, to suggest much better turbo-codes than 
the one proposed there, and, most importantly, to address the 
issue of catastrophic error propagation for recursive quantum 
convolutional encoders. 

Non-catastrophic and recursive convolutional encoders are 
responsible for the great success of parallel and serial classical 
turbo-codes. In a serial concatenation scheme, an inner convolutional
code that is recursive yields turbo-code families with
unbounded minimum distance [22], while non-catastrophic 
error propagation is necessary for iterative decoding conver- 
gence. The last point can be circumvented in several ways (by 
doping for instance, see [45]) and some of these tricks can be 
adapted to the quantum setting, but are beyond the scope of 
this paper. 

The proof [22] that serial turbo-codes have unbounded
minimal distance carries over almost verbatim to the quantum
setting. Thus, it is possible to design quantum turbo-codes 
with polynomially large minimal distances. However, we will 
demonstrate that all recursive quantum convolutional encoders 
have catastrophic error propagation. This phenomenon is re- 
lated to the orthogonality constraints which appear in the 
quantum setting and to the fact that quantum codes are in 
a sense coset codes. As a consequence, such encoders are not 
suitable for (standard) serial turbo-code schemes.

In our constructions, the convolutional codes are therefore 
chosen to be non-catastrophic and non-recursive, so there is 
no guarantee that the resulting families of turbo-codes have a 
minimum distance which grows with the number of encoded 
qubits. Despite these limitations, we provide strong numerical 
evidence that their error probability decreases as we increase 
the block size at fixed rate - and this up to rather large block 
sizes. In other words, from a pragmatic point of view, the 
minimum distances of the codes that we have simulated are 
large enough not to degrade the iterative decoding performance 
up to moderate word error rates (10~^ — 10~^) and block sizes 
($10^2$ - $10^4$).

The style of our presentation is motivated by the intention 
to accommodate a readership familiar with either classical 
turbo-codes or quantum information science. This unavoidably 
implies some redundancy and the expert reader may want to
skip some sections, or perhaps glance at them to pick up the
notation. In particular, the necessary background from classical
coding theory and convolutional codes is presented in the 
next section using the circuit language of quantum information 
science. This framework is somewhat unconventional - block 
codes are defined using reversible matrices rather than parity- 
check or generating matrices, convolutional codes are defined 
via a reversible seed transformation instead of a linear filter 
built from shift registers and feed-back lines - yet requires 
little departure from standard presentations. The benefit is a 
very smooth transition between classical codes and quantum
codes, which are the subject of Sec. III. Whenever possible,
the definitions used in the quantum setting directly mirror
those established in the classical setting. The other benefit
of this framework is that it permits generating all quantum
convolutional codes straightforwardly without being hampered
by the orthogonality constraint. In fact, the codes we describe
are in general not of the CSS class.



Section IV uses the circuit representation to define quantum
convolutional codes and their associated state diagram. The
state diagram is an important tool to understand the properties
of a convolutional code. In particular, the detailed analysis
of the state diagram of recursive convolutional encoders performed
in Sec. IV-E will lead to the conclusion that they all
have catastrophic error propagation. Section V is a detailed
presentation of the iterative decoding procedure used for
quantum turbo-codes. Finally, our numerical results on the
codes' word error rate and spectral properties are presented
in Sec. VI.

II. Classical preliminaries 

The main purpose of this section is to introduce a circuit 
representation of convolutional encoders which simplifies the 
generalization of several crucial notions to the quantum setting.
For instance, it allows one to define in a straightforward way
a state diagram for the quantum analogue of a convolutional 
code which arises naturally from this circuit representation. 
This state diagram will be particularly helpful for defining 
and studying fundamental issues related to turbo-codes such as 
recursiveness and non-catastrophicity of the constituent convo- 
lutional encoders. The circuit representation is also particularly 
well suited to present the decoding algorithm of quantum 
convolutional codes. 

A. Linear block codes 

A classical binary linear code C of dimension k and length 
$n$ can be specified by a full-rank $(n - k) \times n$ parity-check
matrix $H$ over $\mathbb{F}_2$:

$$ C = \{\bar{c} \mid H\bar{c}^T = 0\}. \tag{1} $$

Alternatively, the code can be specified by fixing the encoding
of each information word $c \in \mathbb{F}_2^k$ through a linear mapping
$c \mapsto \bar{c} = cG$ for some full-rank $k \times n$ generator matrix $G$ over
$\mathbb{F}_2$ that satisfies $GH^T = 0$. Since $G$ has rank $k$, there exists
an $n \times k$ matrix over $\mathbb{F}_2$ that we denote, by a slight abuse of
notation, by $G^{-1}$, satisfying $GG^{-1} = \mathbb{1}_k$, where for any integer
$k$, $\mathbb{1}_k$ denotes the $k \times k$ identity matrix. Similarly, since $H$
has rank $n - k$, there exists an $n \times (n - k)$ matrix $H^{-1}$ over
$\mathbb{F}_2$ satisfying $HH^{-1} = \mathbb{1}_{n-k}$.

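Lemma 1: The right inverses $H^{-1}$ and $G^{-1}$ can always be
chosen such that $(H^{-1})^T G^{-1} = 0$.

Proof: Let $B = (H^{-1})^T G^{-1}$. The substitution $H^{-1} \to
H^{-1} + G^T B^T$ preserves the property $HH^{-1} = \mathbb{1}$, since
$HG^T = (GH^T)^T = 0$, and fulfills the desired requirement, since
$(H^{-1} + G^T B^T)^T G^{-1} = B + BGG^{-1} = 0$ over $\mathbb{F}_2$. ■

We will henceforth assume that the right inverses $H^{-1}$ and
$G^{-1}$ are chosen to fulfill the condition of Lemma 1.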

To study the analogy between classical linear binary codes 
and stabilizer codes, we view a rate $k/n$ classical linear code and
its encoding in a slightly unconventional fashion. We specify
the encoding by an $n \times n$ invertible encoding matrix $V$ over
$\mathbb{F}_2$. The code space is defined as

$$ C = \{\bar{c} = (c : 0_{n-k})V \mid c \in \mathbb{F}_2^k\}, \tag{2} $$

where we use the following notation. 

Notation 1: For an $n$-tuple $a \in \mathcal{A}^n$ and an $m$-tuple $b \in
\mathcal{A}^m$ over some alphabet $\mathcal{A}$, we denote by $a : b$ the $(n + m)$-tuple
formed by the concatenation of $a$ followed by $b$.

Given the generator matrix $G$ and parity-check matrix $H$ of
a code, the encoding matrix $V$ can be fixed to

$$ V = \begin{pmatrix} G \\ (H^{-1})^T \end{pmatrix}. \tag{3} $$

This matrix is invertible, with inverse

$$ V^{-1} = \begin{pmatrix} G^{-1} & H^T \end{pmatrix}, \tag{4} $$

which satisfies $VV^{-1} = \mathbb{1}_n$ following Lemma 1. Clearly, the
encoding matrix $V : \mathbb{F}_2^n \to \mathbb{F}_2^n$ specifies both the code space
and the encoding. The output $b = aV$ of the encoding matrix
$V$ is in the code space if and only if the input is of the form
$a = (c : 0_{n-k})$ where $c \in \mathbb{F}_2^k$. This follows from the equalities
$(c : 0_{n-k})V = cG = \bar{c} \in C$ and $(c : s)VH^T = s$.
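As a concrete illustration of Eqs. (3) and (4), the following Python sketch builds the encoding matrix of the [7,4] Hamming code. The choice of code, the systematic form of $G$ and $H$, and the particular right inverses are assumptions made for this example and are not taken from the text; with them, the condition of Lemma 1 holds by construction.

```python
import numpy as np

# Assumed example: [7,4] Hamming code in systematic form,
# G = [1_k | A] and H = [A^T | 1_{n-k}], so that G H^T = A + A = 0 over F_2.
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]], dtype=int)
k, r = A.shape                      # r = n - k
n = k + r
G = np.hstack([np.eye(k, dtype=int), A])
H = np.hstack([A.T, np.eye(r, dtype=int)])

# Right inverses chosen so that (H^{-1})^T G^{-1} = 0 (Lemma 1).
G_inv = np.vstack([np.eye(k, dtype=int), np.zeros((r, k), dtype=int)])
H_inv = np.vstack([np.zeros((k, r), dtype=int), np.eye(r, dtype=int)])

V = np.vstack([G, H_inv.T])         # Eq. (3)
V_inv = np.hstack([G_inv, H.T])     # Eq. (4)


def encode(c, s=None):
    """Return (c : s) V over F_2; s defaults to the zero syndrome."""
    s = np.zeros(r, dtype=int) if s is None else np.asarray(s)
    return np.concatenate([np.asarray(c), s]) @ V % 2


def decompose(p):
    """Return the pair (l, s) such that p V^{-1} = (l : s)."""
    ls = np.asarray(p) @ V_inv % 2
    return ls[:k], ls[k:]
```

With these definitions, `encode(c)` is a codeword for every 4-bit `c`, and `decompose` recovers the logical and syndrome parts of any 7-bit string.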

The encoding matrix also specifies the syndrome associated
with each error. When transmitted on a bit-flip channel, a
codeword $\bar{c}$ will result in the message $m = \bar{c} + p$ for some
$p \in \mathbb{F}_2^n$. The error $p$ can be decomposed into an error syndrome
$s \in \mathbb{F}_2^{n-k}$ and a logical error $l \in \mathbb{F}_2^k$ as $pV^{-1} = (l : s)$. This
is conveniently represented by the circuit diagram shown at
Fig. 1, in which time flows from left to right. In such diagrams,
the inverse $V^{-1}$ is obtained by reading the circuit from right to
left, running time backwards. This circuit representation is at
the core of our construction of quantum turbo-codes; it greatly
simplifies all definitions and analyses.

A probability distribution $P(p)$ on the error $p$ incurred during
transmission induces a probability distribution on logical
transformations and syndromes

$$ P(l, s) = P(p)\big|_{p = (l : s)V}. \tag{5} $$

Fig. 1. Circuit representation of the encoder $(l : s)V = p$. Slashed wires with
integer superscript $j$ indicate a $j$-bit input/output. The $k$ input bits are called the
logical bits, the other $n - k$ input bits are called syndrome or stabilizer bits,
and the $n$ output bits are the physical bits. The string $p \in \mathbb{F}_2^n$ is a codeword
if and only if $s = 0_{n-k}$.

We call $P(l, s)$ the pullback of the probability $P(p)$ through
the gate $V$. Maximum likelihood decoding $l_{ML}$
consists in identifying the most likely logical transformation $l$
given the syndrome $s$,

$$ l_{ML}(s) = \mathrm{argmax}_l\, P(l|s), \tag{6} $$

where the conditional probability is defined the usual way

$$ P(l|s) = \frac{P(l, s)}{\sum_{l'} P(l', s)}. \tag{7} $$

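Similarly, we can define the bit-wise maximum likelihood decoder
$l^i_{ML} : \mathbb{F}_2^{n-k} \to \mathbb{F}_2$ which performs a local optimization
on each logical bit

$$ l^i_{ML}(s) = \mathrm{argmax}_{l^i}\, P(l^i|s), \tag{8} $$

where the marginal conditional probability is defined the usual
way

$$ P(l^i|s) = \sum_{l^1, \ldots, l^{i-1}, l^{i+1}, \ldots, l^k} P(l^1, \ldots, l^k|s). \tag{9} $$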
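The two decoders of Eqs. (6) and (8) can be sketched by brute force for a very small code. The example below is only an illustration under stated assumptions: it uses the 3-bit repetition code with a hand-picked $V^{-1}$, an independent bit-flip channel of flip probability `eps`, and hypothetical function names. It enumerates all $2^n$ errors, so it is not meant as a practical decoder.

```python
import itertools
import numpy as np

# Assumed example: 3-bit repetition code, G = (1 1 1), H = [[1,1,0],[0,1,1]].
# V^{-1} = (G^{-1}, H^T) with G^{-1} = (1,0,0)^T, as in Eq. (4).
V_inv = np.array([[1, 1, 0],
                  [0, 1, 1],
                  [0, 0, 1]], dtype=int)
k = 1


def pullback(V_inv, k, eps):
    """Tabulate P(l, s) of Eq. (5) for i.i.d. bit flips of probability eps."""
    n = V_inv.shape[0]
    table = {}
    for p in itertools.product((0, 1), repeat=n):
        w = sum(p)
        ls = tuple(int(x) for x in np.array(p) @ V_inv % 2)
        table[(ls[:k], ls[k:])] = eps ** w * (1 - eps) ** (n - w)
    return table


def ml_decode(table, s):
    """Eq. (6): most likely logical transformation given the syndrome s.
    Dividing by P(s) as in Eq. (7) does not change the argmax."""
    candidates = {l: pr for (l, s2), pr in table.items() if s2 == s}
    return max(candidates, key=candidates.get)


def bitwise_ml_decode(table, s, k):
    """Eq. (8): optimize each logical bit on its marginal, Eq. (9)."""
    decoded = []
    for i in range(k):
        marginal = [0.0, 0.0]
        for (l, s2), pr in table.items():
            if s2 == s:
                marginal[l[i]] += pr
        decoded.append(int(marginal[1] > marginal[0]))
    return tuple(decoded)


P = pullback(V_inv, k, eps=0.1)
```

For a code with a single logical bit the two decoders coincide; they can differ as soon as $k > 1$.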

B. Convolutional codes 

We define now a convolutional code as a linear code whose
encoder $V$ has the form shown at Fig. 2. The circuit is built
from repeated uses of a linear invertible seed transformation
$U : \mathbb{F}_2^{n+m} \to \mathbb{F}_2^{n+m}$ shifted by $n$ bits. In this circuit,
particular attention must be paid to the order of the inputs
as they alternate between syndrome bits and logical bits. The
total number of identical repetitions is called the duration of
the code and is denoted $N$. The $m$ bits that connect gates
from consecutive "time slices" are called memory bits. The
encoding is initialized by setting the first $m$ memory bits to
$m_0 = 0_m$. There are several ways to terminate the encoding,
but we here focus on a padding technique. This simply consists
in setting the $k$ logical bits of the last $t$ time slices $i =
N + 1, N + 2, \ldots, N + t$ equal to $l_i = 0_k$, where $t$ is a free
parameter independent of $N$. The rate of the code is thus
$k/n + O(1/N)$.
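The repeated application of the seed transformation with padding can be sketched as follows. This is a minimal illustration, assuming the input ordering (memory : logical : syndrome) and output ordering (physical : memory) for each use of $U$; the function name and the small example seed are hypothetical.

```python
import numpy as np


def conv_encode(U, n, k, m, logical, t):
    """Encode a stream of k-bit logical words with the seed matrix U.

    Each time slice computes (p_i : m_i) = (m_{i-1} : l_i : s_i) U over F_2,
    with all syndrome bits s_i set to zero (codewords) and t padding slices
    whose logical bits are zero.
    """
    mem = np.zeros(m, dtype=int)                 # m_0 = 0_m
    zero_syndrome = np.zeros(n - k, dtype=int)
    padded = list(logical) + [[0] * k] * t       # slices N+1, ..., N+t
    physical = []
    for l in padded:
        x = np.concatenate([mem, l, zero_syndrome]) @ U % 2
        physical.append(x[:n])                   # p_i
        mem = x[n:]                              # m_i
    return np.array(physical), mem


# Assumed toy seed with n = 2, k = 1, m = 1 (rows: memory, logical, syndrome;
# columns: p^1, p^2, memory). With s = 0 it gives p^1_i = l_i, p^2_i = l_i + l_{i-1}.
U_toy = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 0, 1]], dtype=int)
```

Here $N$ logical words produce $n(N + t)$ physical bits, which is where the $O(1/N)$ correction to the rate $k/n$ comes from.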

Note that in this diagram, we use a subscript to denote the
different elements of a stream. For instance, $p_i$ denotes the $n$-bit
output string at time $i$. The $j$th bit of $p_i$ would be denoted
by a superscript as $p_i^j$, or simply $p^j$ when the particular time $i$
is clear from context. This convention will be used throughout
the paper.

Fig. 2. Circuit diagram of a convolutional encoder with seed transformation
$U$.

Fig. 3. Representation of a convolutional encoder as a linear filter. The labels
$f$ and $q$ take value 0 and 1 and indicate respectively the absence or presence
of the associated wire. Although linear, this transformation is not invertible.

This definition of convolutional code differs at first sight
from the usual one based on linear filters built from shift
registers and feed-back lines. An example of a linear filter for
a rate 1/2 (systematic and recursive) convolutional encoder
is shown at Fig. 3. Another common description of this
encoder would be in terms of its rational transfer function
which relates the $D$-transform of the output $p(D)$ to that of
the input $l(D)$. Remember that the $D$-transform of a bit stream
$x_1 : x_2 : x_3 : \ldots$ is given by $x(D) = \sum_i x_i D^i$. For the code
of Fig. 3 the output's $D$-transforms are

$$ p^1(D) = l(D), \tag{10} $$

$$ p^2(D) = \frac{f_0 + f_1 D + \ldots + f_m D^m}{1 + q_1 D + \ldots + q_m D^m}\, l(D), \tag{11} $$

where the inverse is the Laurent series defined by long
division. The code can also be specified by the recursion
relation

$$ p^1_i = l_i, $$

$$ m^1_i = l_i + \sum_{j=1}^m q_j m^j_{i-1}, \qquad m^j_i = m^{j-1}_{i-1} \ \text{ for } j > 1, $$

$$ p^2_i = f_0 \Big( l_i + \sum_{j=1}^m q_j m^j_{i-1} \Big) + \sum_{j=1}^m f_j m^j_{i-1}, $$

where $m^j_i$ denotes the content of the $j$th shift register at time $i$.

These definitions are in fact equivalent to the circuit of
Fig. 2 with the seed transformation $U$ specified by Fig. 4. Note
that we can assume without loss of generality that $f_m = 1$ or
$q_m = 1$ (or both), and these two cases lead to different seed
transformations. The generalization to arbitrary linear filters is 
straightforward. In terms of matrices, the seed transformation 
associated to this convolutional code encodes the relation 




Fig. 4. Seed transformation circuit for the convolutional code of Fig. 3. Top:
Case $f_m = 1$. Bottom: Case $q_m = 1$. The labels $f$ and $q$ take value 0 and 1
and indicate respectively the absence or presence of the associated gate. Both
circuits are entirely built from controlled-nots, and are therefore invertible.
As its name indicates, the controlled-not acts by negating the target bit ⊕ if
and only if the control bit • is in state 1.



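$(p_i : m_i) = (m_{i-1} : l_i : s_i)\,U$ with $U$ given by

$$ U = \begin{pmatrix} \mu_p & \mu_m \\ \Lambda_p & \Lambda_m \\ \Sigma_p & \Sigma_m \end{pmatrix}, $$

where

$$ \mu_p = \begin{pmatrix} 0 & f_1 + f_0 q_1 \\ \vdots & \vdots \\ 0 & f_m + f_0 q_m \end{pmatrix}, \qquad \mu_m = \begin{pmatrix} q_1 & 1 & & \\ \vdots & & \ddots & \\ q_{m-1} & & & 1 \\ q_m & 0 & \cdots & 0 \end{pmatrix}, $$

$\Lambda_p = (1, f_0)$, and $\Lambda_m = (1 \; 0_{m-1})$. The two other components
depend on whether $f_m = 1$ or $q_m = 1$. In the former case
$\Sigma_p = (0, f_0)$ and $\Sigma_m = (1 \; 0_{m-1})$, while in the latter case
$\Sigma_p = (0, 1)$ and $\Sigma_m = (0_m)$.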
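The equivalence between the seed transformation and the linear filter can be checked numerically. In the sketch below the blocks $\Lambda$ and $\Sigma$ follow the values quoted above, while the memory rows $\mu_p$, $\mu_m$ are an assumption: they are derived from the shift-register recursion of the rate-1/2 filter rather than copied from the text. All function names are hypothetical.

```python
import numpy as np


def seed_matrix(f, q, case):
    """Seed U for the rate-1/2 filter f(D)/q(D); f = (f_0..f_m), q = (q_1..q_m).

    Rows are ordered (m_{i-1} : l_i : s_i), columns (p^1, p^2, m^1..m^m).
    case is "f" (requires f_m = 1) or "q" (requires q_m = 1).
    """
    m = len(q)
    U = np.zeros((m + 2, m + 2), dtype=int)
    for j in range(1, m + 1):                        # row of memory bit m^j_{i-1}
        U[j - 1, 1] = (f[j] + f[0] * q[j - 1]) % 2   # contribution to p^2
        U[j - 1, 2] = q[j - 1]                       # feed-back into m^1_i
        if j < m:
            U[j - 1, 2 + j] = 1                      # shift: m^{j+1}_i = m^j_{i-1}
    U[m, 0], U[m, 1], U[m, 2] = 1, f[0], 1           # Lambda_p, Lambda_m
    if case == "f":
        U[m + 1, 1], U[m + 1, 2] = f[0], 1           # Sigma_p = (0, f_0), Sigma_m = (1 0..0)
    else:
        U[m + 1, 1] = 1                              # Sigma_p = (0, 1), Sigma_m = 0
    return U


def rank_f2(M):
    """Rank of a binary matrix by Gaussian elimination over F_2."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivots = [r for r in range(rank, M.shape[0]) if M[r, col]]
        if not pivots:
            continue
        M[[rank, pivots[0]]] = M[[pivots[0], rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank


def run_circuit(U, stream, m):
    """Output pairs (p^1_i, p^2_i) of the circuit with zero syndrome inputs."""
    mem, out = np.zeros(m, dtype=int), []
    for l in stream:
        x = np.concatenate([mem, [l, 0]]) @ U % 2
        out.append((int(x[0]), int(x[1])))
        mem = x[2:]
    return out


def run_filter(f, q, stream):
    """Output pairs of the shift-register filter with feed-back taps q."""
    reg, out = [0] * len(q), []
    for l in stream:
        a = (l + sum(qj * rj for qj, rj in zip(q, reg))) % 2
        p2 = (f[0] * a + sum(f[j + 1] * reg[j] for j in range(len(q)))) % 2
        out.append((l, p2))
        reg = [a] + reg[:-1]
    return out
```

For taps such as `f = (1, 0, 1)` and `q = (1, 1)`, both cases apply, and one can check that each seed matrix has full rank and reproduces the filter output on any input stream.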

Not only does the circuit of Fig. 2 produce the same
encoding as the linear filter of Fig. 3, it also has the same
memory states. More precisely, the value contained in the $j$th
shift register at time $i$ in Fig. 3 is equal to the value of the
$j$th memory bit between gates $i$ and $i + 1$ on Fig. 2. This is
important because it allows one to define the state diagram (see
Sec. IV-B) directly from the circuit diagram of Fig. 4.

Of particular interest are systematic recursive encoders that 
are defined as follows. 

Definition 1 (Systematic encoder): An encoder is system- 
atic when the input stream is a sub-stream of the output stream. 

Definition 2 (Recursive encoder): A convolutional encoder 
is recursive when its rational transfer function involves gen- 
uine Laurent series (as opposed to simple polynomials). 

Systematic encoders copy the input stream in clear in one
of the output streams. Typically they have transfer functions of
the form $p^j(D) = l^j(D)$ for $j = 1, \ldots, k$ and arbitrary $p^j(D)$
for $j > k$, so $p^j_i$ is a copy of $l^j_i$. The systematic character of
the code considered in the above example is most easily seen
from Fig. 3: $p^1$ is a copy of the input $l$. Systematic encoders
are used to avoid catastrophic error propagation. This term will
be defined formally in the quantum setting, but it essentially
means that an error affecting a finite number of physical bits
is mapped to a logical transformation on an infinite number
of logical bits by the encoder inverse. Catastrophic encoders
cannot be used directly in standard turbo-code schemes. The
problem is that the first iteration of iterative decoding does not
provide information on the logical bits. This is due to the fact
that as the length of the convolutional encoder tends to infinity
and in the absence of prior information about the value of the
logical bits, the logical bit error rate after decoding tends to $1/2$.

A recursive encoder has an infinite impulse response: on
an input $l$ of Hamming weight 1, it creates an output of infinite
weight for a code of infinite duration $N$. Recursiveness is also
related to the presence of feed-back in the encoding circuit,
which is easily understood from the linear filter of Fig. 3.
Except when the polynomial $\sum_i q_i D^i$ factors $\sum_i f_i D^i$, an
encoder with feed-back will be recursive. It is essential to use
recursive convolutional codes as constituents in classical
turbo-code schemes to obtain families of turbo-codes of unbounded
minimum distance and with performance that improves with
the block size.
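The infinite impulse response of a recursive encoder can be observed directly on the shift-register filter. The sketch below is an illustration with assumed tap values; it computes the second output stream $p^2$ for an input of Hamming weight 1 and compares three cases: feed-back that does not divide the feed-forward polynomial, no feed-back, and feed-back that does divide it.

```python
def impulse_response(f, q, T):
    """First T bits of p^2 for the input l = 1, 0, 0, ... through f(D)/q(D).

    f = (f_0, ..., f_m) are the feed-forward taps and q = (q_1, ..., q_m)
    the feed-back taps (the constant term of the denominator is 1).
    """
    m = len(q)
    reg = [0] * m
    out = []
    for i in range(T):
        l = 1 if i == 0 else 0
        a = (l + sum(qj * rj for qj, rj in zip(q, reg))) % 2
        out.append((f[0] * a + sum(f[j + 1] * reg[j] for j in range(m))) % 2)
        reg = [a] + reg[:-1]
    return out


def response_weight(f, q, T):
    """Hamming weight of the truncated impulse response."""
    return sum(impulse_response(f, q, T))


# Assumed example taps, with numerator 1 + D^2 throughout:
recursive = ((1, 0, 1), (1, 1))       # denominator 1 + D + D^2
feed_forward = ((1, 0, 1), (0, 0))    # denominator 1, no feed-back
cancelling = ((1, 0, 1), (1, 0))      # denominator 1 + D divides 1 + D^2
```

In the first case the weight keeps growing with the truncation length `T`, while in the other two it stays bounded, in line with the exception stated above.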

III. Quantum Mechanics and Quantum Codes 

In this section, we review some basic notions of quantum
mechanics, the stabilizer formalism, and the decoding problem
for quantum codes. In Sec. III-B stabilizer codes are defined
the usual way, as subspaces of the Hilbert space stabilized
by an Abelian subgroup of the Pauli group. We detail in
Sec. III-C how these codes are decoded. Even if a stabilizer
code is a continuous space, it can be defined and studied
by using only discrete objects (parity-check matrix, encoding
matrix, syndrome) which are quite close to classical linear
codes. We discuss in Sec. III-D the relations between such
quantum codes and classical linear codes but also highlight 
the crucial distinctions between them. Particular emphasis is 
put on the role of the encoder because it is a crucial ingredient 
for our definition of quantum turbo-codes. The encoder also 
provides an intuitive picture for the logical cosets, which are 
an important distinction between classical codes and quantum 
stabilizer codes. 

A. Qubits and the Pauli group 

A qubit is a physical system whose state is described by
a unit-length vector in a two-dimensional Hilbert space. The
two vectors of a given orthonormal basis are conventionally
denoted by $|0\rangle$ and $|1\rangle$. We identify the Hilbert space with
$\mathbb{C}^2$ in the usual way with the help of such a basis. The state
of a system comprising $n$ qubits is a unit-length vector in
the tensor product of $n$ two-dimensional Hilbert spaces. It
is a space of dimension $2^n$ which can be identified with
$(\mathbb{C}^2)^{\otimes n} \cong \mathbb{C}^{2^n}$. It has a basis given by all tensor products
of the form $|x_1\rangle \otimes \cdots \otimes |x_n\rangle$, where the $x_i \in \{0, 1\}$, and the
inner product between two basis elements $|x_1\rangle \otimes \cdots \otimes |x_n\rangle$
and $|y_1\rangle \otimes \cdots \otimes |y_n\rangle$ is the product of the inner products of
$|x_i\rangle$ with the corresponding $|y_i\rangle$. In other words, this basis is
orthonormal. It will be convenient to use the following notation.

Notation 2:

$$ |0_n\rangle = \underbrace{|0\rangle \otimes \cdots \otimes |0\rangle}_{n \text{ times}}. $$

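The error model we consider in this paper is a Pauli-memoryless
channel which is defined with the help of the
three Pauli matrices

$$ \mathcal{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \mathcal{Y} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \mathcal{Z} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. $$

These matrices anti-commute with each other and satisfy the
following multiplication table (row times column)

$$ \begin{array}{c|ccc} \times & \mathcal{X} & \mathcal{Y} & \mathcal{Z} \\ \hline \mathcal{X} & \mathcal{I} & i\mathcal{Z} & -i\mathcal{Y} \\ \mathcal{Y} & -i\mathcal{Z} & \mathcal{I} & i\mathcal{X} \\ \mathcal{Z} & i\mathcal{Y} & -i\mathcal{X} & \mathcal{I} \end{array} $$

where $\mathcal{I}$ denotes the $2 \times 2$ identity matrix. The action of
these operators on the state of a qubit is obtained by right
multiplication $|\psi\rangle \to \mathcal{P}|\psi\rangle$, with $|\psi\rangle$ viewed as an element of
$\mathbb{C}^2$.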

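These matrices generate the Pauli group $\mathcal{G}_1$ which is readily
seen to be the set

$$ \{\pm\mathcal{I}, \pm i\mathcal{I}, \pm\mathcal{X}, \pm i\mathcal{X}, \pm\mathcal{Y}, \pm i\mathcal{Y}, \pm\mathcal{Z}, \pm i\mathcal{Z}\}. $$

They also form all the errors which may affect one qubit in
our error model. If we have an $n$-qubit system, then the errors
which may affect it belong to the Pauli group over $n$ qubits
which is defined by

$$ \mathcal{G}_n = \mathcal{G}_1^{\otimes n} = \{\epsilon\, \mathcal{P}_1 \otimes \cdots \otimes \mathcal{P}_n \mid \epsilon \in \{\pm 1, \pm i\},\ \mathcal{P}_i \in \{\mathcal{I}, \mathcal{X}, \mathcal{Y}, \mathcal{Z}\}\}. $$

This group is generated by $i$ and the set of $\mathcal{X}_i$'s and $\mathcal{Z}_i$'s for
$i = 1, 2, \ldots, n$ which are defined by:

Notation 3:

$$ \mathcal{X}_i = \underbrace{\mathcal{I} \otimes \cdots \otimes \mathcal{I}}_{i-1 \text{ times}} \otimes\, \mathcal{X} \otimes \underbrace{\mathcal{I} \otimes \cdots \otimes \mathcal{I}}_{n-i \text{ times}}, $$

$$ \mathcal{Z}_i = \underbrace{\mathcal{I} \otimes \cdots \otimes \mathcal{I}}_{i-1 \text{ times}} \otimes\, \mathcal{Z} \otimes \underbrace{\mathcal{I} \otimes \cdots \otimes \mathcal{I}}_{n-i \text{ times}}. $$
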
In quantum mechanics two states are physically indistinguishable
if they differ by a multiplicative constant. This
motivates the definition of another group of errors, called the
effective Pauli group, obtained by taking the quotient of $\mathcal{G}_n$
by $\{\pm\mathcal{I}, \pm i\mathcal{I}\}$.

Definition 3 (Effective Pauli group): The effective Pauli
group $G_n$ on $n$ qubits is the set of equivalence classes $[\mathcal{P}]$ for
$\mathcal{P}$ in $\mathcal{G}_n$, where the equivalence class $[\mathcal{P}]$ is the set of elements
of $\mathcal{G}_n$ which differ from $\mathcal{P}$ by a multiplicative constant. We
will also use the notation $I = [\mathcal{I}]$, $X = [\mathcal{X}]$, $Y = [\mathcal{Y}]$, $Z = [\mathcal{Z}]$ and
$X_i = [\mathcal{X}_i]$, $Z_i = [\mathcal{Z}_i]$.

All the effective Pauli groups $G_n$ are Abelian. $(G_1, +)$ is
isomorphic to $(\mathbb{F}_2 \times \mathbb{F}_2, +)$ where the group operation of $G_1$
corresponds to bitwise addition over $\mathbb{F}_2 \times \mathbb{F}_2$. As a consequence
effective Pauli operators can be represented by binary couples.
We will henceforth make use of the following representation

$$ I \leftrightarrow (0, 0) \tag{13} $$

$$ X \leftrightarrow (1, 0) \tag{14} $$

$$ Y \leftrightarrow (1, 1) \tag{15} $$

$$ Z \leftrightarrow (0, 1) \tag{16} $$

Note that $G_n = G_1^n$ and we will either view, depending on
the context, an element $P \in G_n$ as an $n$-tuple $(P_i)_{i=1}^n$ with
entries in $G_1$ or as a $2n$-tuple with entries in $\mathbb{F}_2$ obtained by
replacing each $P_i$ by its corresponding binary representation.
$G_n$ is generated by the $X_i$ and $Z_i$, and we introduce the
following notation.

Notation 4: For $P$ in $G_n$, we denote by $P^x$ and $P^z$ the
only elements of $G_n$ satisfying:

1) $P = P^x + P^z$, and

2) $P^x \in \{I, X\}^n$, $P^z \in \{I, Z\}^n$.

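An important property of $\mathcal{G}_n$ is that any pair of elements
$\mathcal{P}, \mathcal{Q}$ either commutes or anti-commutes. This leads to the
definition of an inner product for elements $P = (P_i)_{1 \le i \le n}$
and $Q = (Q_i)_{1 \le i \le n}$ of $G_n$ such that $P \star Q = \sum_{i=1}^n P_i \star Q_i
\bmod 2$. Here, $P_i \star Q_i = 1$ if $P_i \ne Q_i$, $P_i \ne I$ and $Q_i \ne I$;
and $P_i \star Q_i = 0$ otherwise.

Fact 1: $\mathcal{P}, \mathcal{Q} \in \mathcal{G}_n$ commute if and only if $[\mathcal{P}] \star [\mathcal{Q}] = 0$.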

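This product can also be defined with the help of the
following matrix which will appear again later in the definition
of symplectic matrices.

Notation 5:

$$ \Lambda_n = \mathbb{1}_n \otimes X, $$

where $X$ stands here for the $2 \times 2$ binary matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

By viewing now elements of $G_n$ as binary $2n$-tuples we have:

Definition 4 (Inner product): Define the inner product $\star :
G_n \times G_n \to \mathbb{F}_2$ by $P \star Q = P \Lambda_n Q^T$.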
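Fact 1 and Definition 4 can be checked numerically on small examples. The following sketch is an illustration only: it encodes Pauli strings with the binary representation of Eqs. (13)-(16), evaluates $P \Lambda_n Q^T$, and compares the result with the commutation of the corresponding matrices. The string encoding and the function names are assumptions of the example.

```python
import numpy as np

# Binary representation of Eqs. (13)-(16) and the usual 2x2 Pauli matrices.
BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
MATS = {"I": np.eye(2, dtype=complex),
        "X": np.array([[0, 1], [1, 0]], dtype=complex),
        "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
        "Z": np.array([[1, 0], [0, -1]], dtype=complex)}


def to_bits(pauli):
    """Binary 2n-tuple of a Pauli string such as 'XIZ'."""
    return np.array([b for c in pauli for b in BITS[c]], dtype=int)


def star(P, Q):
    """Inner product P * Q = P Lambda_n Q^T over F_2 (Definition 4)."""
    n = len(P)
    Lam = np.kron(np.eye(n, dtype=int), np.array([[0, 1], [1, 0]]))
    return int(to_bits(P) @ Lam @ to_bits(Q) % 2)


def to_matrix(pauli):
    """Tensor product of the single-qubit matrices of a Pauli string."""
    M = np.eye(1, dtype=complex)
    for c in pauli:
        M = np.kron(M, MATS[c])
    return M


def commute(P, Q):
    """True when the matrices of the two Pauli strings commute."""
    A, B = to_matrix(P), to_matrix(Q)
    return bool(np.allclose(A @ B, B @ A))
```

For instance `star("XX", "ZZ")` vanishes because the two single-qubit anti-commutations cancel, and `commute("XX", "ZZ")` is accordingly true.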

$G_n$ is an $\mathbb{F}_2$-vector space and we use the $\star$ inner product to
define the orthogonal space of a subspace of $G_n$ as follows.

Definition 5 (Orthogonal subspace): Let $V$ be a subset of
$G_n$. We define $V^\perp$ by

$$ V^\perp = \{P \in G_n : P \star Q = 0 \text{ for every } Q \in V\}. $$

$V^\perp$ is always a subspace of $G_n$ and if the space spanned by
$V$ is of dimension $t$, then $V^\perp$ is of dimension $2n - t$.

Since two states are indistinguishable if they
differ by a multiplicative constant, a Pauli error need only be
specified by the effective Pauli group equivalence class to which
it belongs. A very important quantum error model is the
depolarizing channel. It is in a sense the quantum analogue
of the binary symmetric channel.

Definition 6 (Depolarizing channel): The depolarizing
channel on $n$ qubits of error probability $p$ is an error model
where all the errors which occur belong to $G_n$ and the
probability that a particular element $P$ is chosen is equal to
$(1 - p)^{n - w(P)} (p/3)^{w(P)}$, where the weight $w(P)$ of a Pauli
error is given by

Notation 6: $w(P)$ is the number of coordinates of $P$ which
differ from $I$.

In other words, the coordinates of the error are chosen independently:
there is no error on a given coordinate with probability
$1 - p$ and there is an error on it of type $X$, $Y$ or $Z$ each with
probability $p/3$.
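A short sketch of this error model may help fix ideas. It is only an illustration: errors are written as strings over the letters I, X, Y, Z, and the function names are hypothetical.

```python
import random


def weight(P):
    """Number of coordinates of the Pauli string P that differ from I."""
    return sum(c != "I" for c in P)


def depolarizing_prob(P, p):
    """Probability of the error P on len(P) qubits: (1-p)^(n-w) (p/3)^w."""
    n, w = len(P), weight(P)
    return (1 - p) ** (n - w) * (p / 3) ** w


def sample_error(n, p, rng=random):
    """Draw each coordinate independently: I with probability 1 - p,
    otherwise X, Y or Z with equal probability."""
    return "".join(rng.choice("XYZ") if rng.random() < p else "I"
                   for _ in range(n))
```

Summing `depolarizing_prob` over all $4^n$ strings of a given length gives 1, which is a quick consistency check of the formula.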

B. Stabilizer codes: Hilbert space perspective 

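A quantum error correction code protecting a system of $k$
qubits by embedding them in a larger system of $n$ qubits is
a $2^k$-dimensional subspace $\mathcal{C}$ of $(\mathbb{C}^2)^{\otimes n}$. We say that it is a
quantum code of length $n$ and rate $k/n$. It can be specified by
a unitary transformation $V$ on $(\mathbb{C}^2)^{\otimes n}$:

$$ \mathcal{C} = \{V(|\psi\rangle \otimes |0_{n-k}\rangle) : |\psi\rangle \in (\mathbb{C}^2)^{\otimes k}\}. \tag{17} $$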



This definition directly reflects Eq. (|2]i. As in the classical 
case, the matrix V specifies not only the code but also the 
encoding, that is the particular embedding (C^)'*''' (C^)®". 
An importance distinction however is that in the quantum 
case, the dimension of the matrix V is exponential in the 
number of qubits n. To obtain an efficiently specifiable code, 
we choose V from a subgroup of the unitary group over 
(C^)®" called the Clifford group. In fact, not only are Clifford 
transformations over n qubits efficiently specifiable, they can 
also be implemented efficiently by a quantum circuit involving 
only 0{n^) elementary quantum gates on 1 and 2 qubits (see 
Theorem 10.6 in [30] for instance). 

Definition 7 (Clifford transformation and Clifford group): A Clifford transformation over $n$ qubits is a unitary transform $\mathcal{V}$ over $(\mathbb{C}^2)^{\otimes n}$ which leaves the Pauli group over $n$ qubits globally invariant by conjugation:

$$\mathcal{V}\mathcal{G}_n\mathcal{V}^\dagger = \mathcal{G}_n.$$

The set of Clifford transformations is a group and is called the Clifford group over $n$ qubits.

This definition naturally leads to the action of the Clifford group on elements of the Pauli group.

Definition 8 (Action of Clifford transformation on Pauli): A Clifford transformation $\mathcal{V}$ acts on the Pauli group as

$$\mathcal{P} \mapsto \mathcal{V}\mathcal{P}\mathcal{V}^\dagger.$$

It also acts on the effective Pauli group by the mapping $[\mathcal{P}] \mapsto [\mathcal{V}\mathcal{P}\mathcal{V}^\dagger]$.

The last mapping is $\mathbb{F}_2$-linear and there is a square binary matrix $V$ of size $2n$ which is such that

$$[\mathcal{V}\mathcal{P}\mathcal{V}^\dagger] = [\mathcal{P}]V.$$

This matrix will be called the encoding matrix.

Definition 9 (Encoding matrix): The encoding matrix $V$ associated to an encoding operation $\mathcal{V}$, which is a Clifford transformation over $n$ qubits, is the binary matrix $V$ of size $2n \times 2n$ such that for any $\mathcal{P} \in \mathcal{G}_n$ we have

$$[\mathcal{V}\mathcal{P}\mathcal{V}^\dagger] = [\mathcal{P}]V.$$

Clearly then, a Clifford transformation on $n$ qubits can be specified by its associated encoding matrix $V$ on $\mathbb{F}_2^{2n}$ together with a collection of $2n$ phases. This shows that Clifford transformations are efficiently specifiable as claimed. It can readily be verified that the rows of $V$, denoted $V_i$, $i = 1, 2, \ldots, 2n$, are equal to

$$V_{2i-1} = [\mathcal{V}X_i\mathcal{V}^\dagger] \tag{18}$$
$$V_{2i} = [\mathcal{V}Z_i\mathcal{V}^\dagger] \tag{19}$$

Since conjugation by a unitary matrix $\mathcal{V}$ does not change the commutation relations, the above equations imply that the encoding matrix is a symplectic matrix, whose definition is recalled below.

Definition 10 (Symplectic transformation): An $n$-qubit symplectic transformation is a $2n \times 2n$ matrix $U$ over $\mathbb{F}_2$ that satisfies

$$U\Lambda_nU^T = \Lambda_n.$$

By definition, symplectic transformations are invertible and preserve the inner product $\star$ between $n$-qubit Pauli group elements. Conversely, every symplectic matrix corresponds to a (non-unique) Clifford transformation.
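The symplectic condition of Definition 10 is easy to test numerically. The sketch below assumes the interleaved bit ordering $(x_1, z_1, x_2, z_2, \ldots)$ and row vectors acting from the left; the example matrices (a Hadamard-like swap of X and Z, and a CNOT) and all function names are illustrative choices, not objects defined in the text. The inverse is computed as $\Lambda_n U^T \Lambda_n$, which follows from $U \Lambda_n U^T = \Lambda_n$ and $\Lambda_n^2 = \mathbb{1}$.

```python
import numpy as np

def lam(n):
    """Lambda_n = 1_n (x) X over F_2."""
    return np.kron(np.eye(n, dtype=int), np.array([[0, 1], [1, 0]]))

def is_symplectic(U):
    """Check U Lambda_n U^T = Lambda_n over F_2."""
    U = np.asarray(U, dtype=int)
    n = U.shape[0] // 2
    if U.shape != (2 * n, 2 * n):
        return False
    return np.array_equal(U @ lam(n) @ U.T % 2, lam(n))

def symplectic_inverse(U):
    """Inverse of a symplectic matrix, Lambda_n U^T Lambda_n."""
    n = U.shape[0] // 2
    return lam(n) @ U.T @ lam(n) % 2

# Rows are the images of X1, Z1 (, X2, Z2) written as (x, z) bit pairs.
HADAMARD = np.array([[0, 1], [1, 0]])          # X -> Z, Z -> X
CNOT = np.array([[1, 0, 1, 0],                 # X1 -> X1 X2
                 [0, 1, 0, 0],                 # Z1 -> Z1
                 [0, 0, 1, 0],                 # X2 -> X2
                 [0, 1, 0, 1]])                # Z2 -> Z1 Z2
NOT_SYMPLECTIC = np.array([[1, 0], [1, 0]])    # singular, hence not symplectic

print(is_symplectic(HADAMARD), is_symplectic(CNOT), is_symplectic(NOT_SYMPLECTIC))
```

Because a symplectic matrix preserves $\star$, one can also check that $(pU)\Lambda_n(qU)^T = p\Lambda_n q^T$ for arbitrary binary row vectors $p$ and $q$.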

A stabilizer code is thus a quantum code specified by Eq. (17), but with $\mathcal{V}$ in the Clifford group. The code $\mathcal{C}$ (but not the encoding) can equivalently be specified with $n - k$ independent mutually commuting elements of $\mathcal{G}_n$ of order 2 as follows:

Definition 11 (Stabilizer code): The stabilizer code $\mathcal{C}$ associated to the stabilizer set $\{\mathcal{H}_i, i = 1..n-k\}$, where the $\mathcal{H}_i$'s are independent mutually commuting elements of $\mathcal{G}_n$ of order 2 and different from $-1$, is the subspace of $(\mathbb{C}^2)^{\otimes n}$ of elements stabilized by the $\mathcal{H}_i$'s, that is

$$\mathcal{C} = \{|\psi\rangle : \mathcal{H}_i|\psi\rangle = |\psi\rangle,\ 1 \le i \le n-k\}. \tag{20}$$
This is the usual definition of stabilizer codes. The $\mathcal{H}_i$ play a role analogous to the rows of the parity-check matrix of a classical linear code, and this connection will be formalized in Subsection III-D. To see the equivalence between this definition and Eq. (17), set $\mathcal{H}_i = \mathcal{V}Z_{k+i}\mathcal{V}^\dagger$. These operators are independent and of order 2 since they are conjugate to the $Z_i$ which are independent and of order 2. Now, consider a $|\bar\psi\rangle \in \mathcal{C}$ as defined in Eq. (17). For all $i$ we have

$$\mathcal{H}_i|\bar\psi\rangle = \mathcal{V}Z_{i+k}\mathcal{V}^\dagger\mathcal{V}(|\psi\rangle \otimes |0_{n-k}\rangle) \tag{21}$$
$$= \mathcal{V}(|\psi\rangle \otimes Z_i|0_{n-k}\rangle) = |\bar\psi\rangle, \tag{22}$$

where we used the fact that $Z|0\rangle = |0\rangle$. Hence, $|\bar\psi\rangle$ satisfies the condition of Def. 11. Conversely, for any state $|\bar\psi\rangle \in \mathcal{C}$ according to Def. 11 we have

$$\mathcal{V}^\dagger|\bar\psi\rangle = \mathcal{V}^\dagger\mathcal{H}_i|\bar\psi\rangle \tag{23}$$
$$= Z_{k+i}\mathcal{V}^\dagger|\bar\psi\rangle, \tag{24}$$

which implies that the $(k+i)$th qubit of $\mathcal{V}^\dagger|\bar\psi\rangle$ must be in state $|0\rangle$. Since this holds for all $i = 1, 2, \ldots, n-k$, we conclude that the two definitions are equivalent. This equivalence has the following consequence:

Fact 2: A stabilizer code of length $n$ associated to $n - k$ independent generators $\mathcal{H}_i$ is of dimension $2^k$.

Since $X$, $Y$, $Z$ are all of order 2, all the generators of order 2 in $\mathcal{G}_n$ are of the form $\pm\mathcal{P}$ where $\mathcal{P}$ is a tensor product of $n$ matrices all chosen among the set $\{I, X, Y, Z\}$. Thus, we can specify the generators $\mathcal{H}_i$ of the stabilizer code by giving only the associated effective Pauli group elements together with a sign for each generator. Changing the sign of a stabilizer generator changes the code, but not its properties¹. More precisely, the set of Pauli errors which can be corrected by such a code does not depend on the signs which have been chosen. Hence, we can specify a family of "equivalent" codes by specifying instead of the $\mathcal{H}_j$'s the set of $H_j = [\mathcal{H}_j] = Z_{j+k}V$. It is important to note that these elements have to be orthogonal: the fact that the $\mathcal{H}_i$'s commute translates into the orthogonality condition $H_i \star H_j = 0$. Thus, the $H_i$ span a linear space called the stabilizer space, that we denote $C(I)$ for reasons that will become apparent later.

Thus, in analogy with classical linear codes, a stabilizer code (or more precisely an equivalence class thereof) can be efficiently specified by an encoding matrix $V$ on $\mathbb{F}_2^{2n}$. This matrix also provides an efficient description of the encoding up to a set of phases. There is another analogy with a classical encoding matrix that will be crucial for our definition of quantum turbo-codes. Assume that we concatenate two stabilizer codes and that these codes are encoded by Clifford transformations. The result of the concatenation is also a stabilizer code (because Clifford transformations form a group) and the resulting encoding matrix is just the product of the two encoding matrices of each constituent code. This reflects the fact that the encoding matrices provide a representation of the Clifford group.

Fact 3: Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be two Clifford transformations over $n$ qubits with encoding matrices $V_1$ and $V_2$ respectively. Then $\mathcal{V}_2\mathcal{V}_1$ is a Clifford transformation with encoding matrix $V_1V_2$.

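Proof: Consider the Clifford transformation $\mathcal{V} = \mathcal{V}_2\mathcal{V}_1$. It suffices to verify the statement on a generating set of the Pauli group:

$$[\mathcal{V}X_i\mathcal{V}^\dagger] = [\mathcal{V}_2(\mathcal{V}_1X_i\mathcal{V}_1^\dagger)\mathcal{V}_2^\dagger] = [\mathcal{V}_1X_i\mathcal{V}_1^\dagger]V_2 = X_iV_1V_2. \tag{25}$$

Equation (25) uses the fact that $\mathcal{V}_1X_i\mathcal{V}_1^\dagger$ belongs to $\mathcal{G}_n$. The same kind of result holds for the $Z_i$'s and this completes the proof. ■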

C. Decoding 

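When transmitted on a Pauli channel, an encoded state $|\bar\psi\rangle = \mathcal{V}(|\psi\rangle \otimes |0_{n-k}\rangle)$ (where $|\psi\rangle$ belongs to $(\mathbb{C}^2)^{\otimes k}$) will result in a state $\mathcal{P}|\bar\psi\rangle$ for some $\mathcal{P} \in \mathcal{G}_n$. Upon inverting the encoding we obtain the state

$$\mathcal{V}^\dagger\mathcal{P}|\bar\psi\rangle = (\mathcal{L}|\psi\rangle) \otimes (\mathcal{S}|0_{n-k}\rangle),$$

where $\mathcal{L}$ belongs to $\mathcal{G}_k$ and $\mathcal{S} = \alpha\,\mathcal{S}_1 \otimes \cdots \otimes \mathcal{S}_{n-k}$ belongs to $\mathcal{G}_{n-k}$ (and the $\mathcal{S}_i$'s to $\{I, X, Y, Z\}$). Notice that $\mathcal{S}|0_{n-k}\rangle$ is equal to $\epsilon|s_1\rangle \otimes \cdots \otimes |s_{n-k}\rangle$ where $\epsilon \in \{\pm 1, \pm i\}$ and

$$s_i = 0 \text{ if } \mathcal{S}_i \in \{I, Z\}, \tag{26}$$
$$s_i = 1 \text{ otherwise}. \tag{27}$$

¹ This is strictly true for Pauli channels which are considered here. For a general noise model, error correcting properties may actually depend on the sign of the stabilizer generators.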


Measuring the $n - k$ last qubits reveals $s_1 \ldots s_{n-k}$ which is the analogue of a classical syndrome. This motivates the following definition.

Definition 12 (Error syndrome): The syndrome $s(\mathcal{P})$ associated to a Pauli error $\mathcal{P}$ is the binary vector $(s_i)_{1 \le i \le n-k}$ defined by Equations (26) and (27).

Note that the syndrome $s(\mathcal{P})$ can be obtained from the $H_i$'s (which are defined as in the previous subsection by $H_i = [\mathcal{V}Z_{k+i}\mathcal{V}^\dagger] = Z_{k+i}V$) by

Proposition 1:

$$s(\mathcal{P}) = ([\mathcal{P}] \star H_i)_{1 \le i \le n-k}.$$

Proof: $s_i(\mathcal{P})$ is equal to $[\mathcal{P}]V^{-1} \star Z_{i+k}$ by definition. Since symplectic transformations preserve the symplectic inner product we deduce that $s_i(\mathcal{P}) = ([\mathcal{P}]V^{-1}) \star Z_{i+k} = [\mathcal{P}] \star Z_{i+k}V = [\mathcal{P}] \star H_i$. ■

This proposition motivates the following definition of a parity-check matrix of a stabilizer code.

Definition 13 (Parity-check matrix): The parity-check matrix $H$ of a quantum code with stabilizer set $\{H_1, \ldots, H_{n-k}\}$ is the binary matrix of size $(n-k) \times 2n$ with rows $H_1, \ldots, H_{n-k}$.
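Proposition 1 reduces syndrome extraction to $n - k$ inner products with the rows of the parity-check matrix. The sketch below illustrates this on the three-qubit bit-flip repetition code, whose stabilizer generators $ZZI$ and $IZZ$ are an assumption chosen for the example (this code does not appear in the text); the function names are likewise illustrative.

```python
# Assumed bit convention: each Pauli symbol is an (x, z) pair.
BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}

def star(p, q):
    """Symplectic inner product of two Pauli strings (0 means 'commute')."""
    return sum(BITS[a][0] * BITS[b][1] + BITS[a][1] * BITS[b][0]
               for a, b in zip(p, q)) % 2

def syndrome(error, H):
    """s(P) = (P * H_i) for every row H_i of the parity-check matrix."""
    return tuple(star(error, h) for h in H)

# Rows of the parity-check matrix, written as Pauli strings for readability.
H = ["ZZI", "IZZ"]
assert star(H[0], H[1]) == 0  # rows must be mutually orthogonal

for error in ("III", "XII", "IXI", "IIX", "ZII", "YII"):
    print(error, syndrome(error, H))
```

Note that $ZII$ has the trivial syndrome and that $XII$ and $YII$ share a syndrome: the syndrome only sees the part of the error that anticommutes with the stabilizer generators.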



Fig. 5. Circuit representation of encoder $(L : S)V = P$. The operator $P \in G_n$ is a codeword (has trivial syndrome) if and only if $S \in \{I, Z\}^{n-k}$.



The calculation of the syndrome depends only on the effective Pauli error $P = [\mathcal{P}]$. As we did for classical errors in Sec. II-A, it will be convenient to decompose the error as $PV^{-1} = (L : S)$, with $L \in G_k$ and $S \in G_{n-k}$. Like in the classical case, this is conveniently represented by the circuit diagram of Fig. 5. At this point however, the analogy with the classical case partially breaks down. As described in Section II-A, in the classical setting a bit-flip error $p$ can be decomposed as $pV^{-1} = (l : s)$. In that case, $s$ is the error syndrome and is therefore known. Decoding then consists in identifying the most likely $l$ given knowledge of $s$. In the quantum case however, $S$ is only partially determined by the error syndrome $s(P)$. Indeed, we can decompose $S$ as $S = S^x + S^z$ (c.f. Notation 4), and notice that from (26) and (27), $s(P)$ reveals only $S^x$. More precisely, we have the following relation for the $i$-th component $S^x_i$ of $S^x$:

$$S^x_i = X \text{ if } s_i = 1,$$
$$S^x_i = I \text{ otherwise}.$$

Hence, two physical errors $P = (L : S^x + S^z)V$ and $P' = (L : S^x + S'^z)V = P + (I_k : S^z + S'^z)V$ have the same error syndrome² $S^x$, so cannot be distinguished. However, they also yield the same logical transformation $L$, so they can be corrected by the same operation (namely applying $L = L^{-1}$ again). Therefore, they cannot and need not be distinguished by the error syndrome: such errors are called degenerate. This reflects the fact that all errors of the form $P = (I_k : S^z)V$ (with $S^z \in \{I, Z\}^{n-k}$) have zero syndrome but do not need to be corrected. We denote such kind of errors by

Definition 14 (Harmless undetected errors): The set of errors $P$ of the form $P = (I_k : S^z)V$ where $S^z$ ranges over $\{I, Z\}^{n-k}$ is called the set of harmless undetected errors.

All the other errors of zero syndrome (and which are therefore undetected) have a non-trivial action on the $k$ first qubits after inverting the encoding transformation. This motivates the following definition.

Definition 15 (Harmful undetected errors): The set of errors $P$ of the form $P = (L : S^z)V$ where $S^z$ ranges over $\{I, Z\}^{n-k}$ and $L$ is different from $I_k$ is called the set of harmful undetected errors.

Note that the set of errors of the form $(I_k : S^z)V$ with $S^z$ in $\{I, Z\}^{n-k}$ is also the subgroup spanned by the rows $V_{2j}$ for $j \in \{k+1, \ldots, n\}$, or what is the same, the subgroup spanned by the $H_i = Z_{i+k}V$ for $i \in \{1, \ldots, n-k\}$. In other words

Proposition 2: The set of harmless undetected errors is equal to $C(I)$.

The fact that there are errors which do not need to be corrected has an important consequence. Contrary to the classical setting where the most likely error satisfying the measured syndrome is sought, in the quantum case, we look for the most likely coset of $C(I)$ satisfying the measured syndrome. Such a coset is the set of errors of the form

Definition 16 (Logical coset): Given an encoding matrix $V$, the logical coset $C(L, S^x)$ associated to the logical transformation $L \in G_k$ and to the syndrome $S^x$ (belonging to $\{I, X\}^{n-k}$) is defined as

$$C(L, S^x) = \{P = (L : S^x + S^z)V \mid S^z \in \{I, Z\}^{n-k}\} = (L : S^x)V + C(I).$$

When $S^x = I_{n-k}$ we simply write $C(L)$ instead of $C(L, I_{n-k})$.

What replaces the classical probability that a given information sequence has been sent given a measured syndrome is in the quantum case the probability $P(L|S^x)$ that applying the transformation $L^{-1} = L$ to the $k$ first qubits after performing the inverse of the encoding operation corrects the error on these qubits. It corresponds to the probability that the error belongs to the coset $C(L, S^x)$ which is therefore equal to

$$P(L|S^x) = \frac{P(L, S^x)}{\sum_{L'} P(L', S^x)} \tag{28}$$

where the probability $P(L, S^x)$ is the pullback of $P(P)$ through the encoding matrix

$$P(L, S^x) = \sum_{S^z \in \{I, Z\}^{n-k}} P\big(P = (L : S^x + S^z)V\big). \tag{29}$$

² By a slight abuse of terminology, we use the one-to-one correspondence between $s$ and $S^x$ to refer to both quantities as the error syndrome.

Similarly to the classical setting, maximum likelihood decoding consists in identifying the most likely logical transformation $L$ given the syndrome $S^x$. More formally:

Definition 17 (Maximum likelihood decoder): The maximum likelihood decoder $L_{\rm ML} : \{I, X\}^{n-k} \to G_k$ is defined by

$$L_{\rm ML}(S^x) = \mathrm{argmax}_L\, P(L|S^x). \tag{30}$$

The classical MAP decoding (or bit-wise decoding) also has a quantum analogue.

Definition 18 (Qubit-wise maximum likelihood decoder): The qubit-wise maximum likelihood decoder $L^i_{\rm ML} : \{I, X\}^{n-k} \to G_1$ is defined by

$$L^i_{\rm ML}(S^x) = \mathrm{argmax}_{L^i}\, P(L^i|S^x) \tag{31}$$

where the marginal conditional probability is defined the usual way

$$P(L^i|S^x) = \sum_{L^1, \ldots, L^{i-1}, L^{i+1}, \ldots, L^k} P(L^1 \ldots L^k|S^x). \tag{32}$$

Equation (29) differs from its classical analogue Eq. (5) by a summation over $S^z$ which reflects the coset structure of the code. Aside from this distinction, the maximum-likelihood decoders are defined as in the classical case.
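For very small codes, the maximum likelihood decoder of Definition 17 can be tabulated by brute force, which makes the role of the sum over $S^z$ in Eq. (29) explicit. The sketch below is only an illustration under stated assumptions: the encoding matrix is a 3-qubit CNOT cascade chosen for the example and used here as a block code with $k = 1$ and input order $(L : S_1 : S_2)$, the channel is depolarizing, and all function names are hypothetical.

```python
import itertools
import numpy as np

BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}
SYMS = {bits: sym for sym, bits in BITS.items()}

def to_bits(pauli):
    return np.array([b for c in pauli for b in BITS[c]], dtype=int)

def to_pauli(bits):
    return "".join(SYMS[(int(bits[i]), int(bits[i + 1]))]
                   for i in range(0, len(bits), 2))

def lam(n):
    return np.kron(np.eye(n, dtype=int), np.array([[0, 1], [1, 0]]))

# Illustrative encoding matrix: rows are the images of X1, Z1, X2, Z2, X3, Z3.
V = np.array([to_bits(r) for r in ("XXX", "ZII", "IXX", "ZZI", "IIX", "IZZ")])

def depolarizing(pauli, p):
    w = sum(c != "I" for c in pauli)
    return (1 - p) ** (len(pauli) - w) * (p / 3) ** w

def coset_probabilities(V, k, p):
    """P(L, S^x): total probability of each logical coset, as in Eq. (29)."""
    n = V.shape[0] // 2
    V_inv = lam(n) @ V.T @ lam(n) % 2          # inverse of a symplectic matrix
    table = {}
    for t in itertools.product("IXYZ", repeat=n):
        P = "".join(t)
        LS = to_pauli(to_bits(P) @ V_inv % 2)  # (L : S) = P V^{-1}
        L, S = LS[:k], LS[k:]
        Sx = "".join("X" if c in "XY" else "I" for c in S)  # discard S^z
        table[(L, Sx)] = table.get((L, Sx), 0.0) + depolarizing(P, p)
    return table

def ml_decoder(table):
    """For each syndrome S^x keep the L maximizing P(L, S^x), as in Eq. (30)."""
    best = {}
    for (L, Sx), pr in table.items():
        if Sx not in best or pr > table[(best[Sx], Sx)]:
            best[Sx] = L
    return best

table = coset_probabilities(V, k=1, p=0.05)
decoder = ml_decoder(table)
print(decoder)
```

Maximizing the joint probability $P(L, S^x)$ gives the same answer as maximizing $P(L|S^x)$, since the normalization depends only on $S^x$. The enumeration costs $4^n$ operations, which is exactly what the structured decoders discussed later are meant to avoid.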

D. Comparison between stabilizer codes and classical linear 
codes 

One of the main advantages of the stabilizer formalism is that it allows one to discretize a seemingly continuous problem by studying the effect of Pauli errors (which are discrete) on the continuous code subspace. By classifying these errors, discrete quantities such as error syndromes or parity-check matrices arise naturally. In other words, stabilizer codes share many analogies with classical linear codes, but there are also some fundamental differences. Let us summarize these analogies and differences here. We assume in what follows that the relevant quantum quantities are defined for a stabilizer code $\mathcal{C}$ of length $n$ and rate $\frac{k}{n}$.

Syndrome and parity-check matrix. The parity-check matrix $H$ is a binary matrix of size $(n-k) \times 2n$. It differs from a classical parity-check matrix in two respects:

1) Its rows $H_i$ must be orthogonal with respect to the $\star$-product,

2) The syndrome $s(P)$ of a Pauli error $P$ in $G_n$ is defined with the help of the $\star$-product (rather than by matrix multiplication): $s(P) = (H_i \star P)_{1 \le i \le n-k}$.



Encoding matrix. It is a binary matrix $V$ of size $2n \times 2n$ and must be a symplectic matrix (and any symplectic matrix is the encoding matrix of a certain stabilizer code). Because it is a symplectic matrix, $V^{-1} = \Lambda_nV^T\Lambda_n$, it plays a role analogous to both the classical encoding matrix and its inverse. Like the classical encoding matrix Eq. (4), it contains a generator matrix as a sub-matrix. Like the inverse of the classical encoding matrix, it also contains a parity-check matrix as a sub-matrix. The parity-check matrix is formed of rows $V_{2(k+1)}, V_{2(k+2)}, \ldots, V_{2n}$ while the generating matrix consists of rows $V_1, V_2, \ldots, V_{2k}$. The remaining rows $V_{2k+1}, V_{2(k+1)+1}, \ldots, V_{2n-1}$ are sometimes referred to as "pure errors" [34]. Indeed, taking the rows of $V$ as generators of $G_n$, the syndrome associated to an element of $G_n$ depends only on its pure error component. Hence, their classical analogue is the matrix $(H^{-1})^T$ appearing in the classical encoding matrix Eq. (4).

The encoding matrix $V$ is associated to a (continuous) unitary encoding transformation $\mathcal{V}$. Like in the classical case, the natural decoding process consists in inverting $\mathcal{V}$ and measuring the last $n - k$ qubits, which yields a syndrome that is associated to a parity-check matrix.

Code. We may define the discrete stabilizer code as in the classical setting as the set of errors with zero syndrome, that is

Definition 19 (Discrete stabilizer code): The discrete stabilizer code $C$ associated to the stabilizer set $\{H_i, i = 1..n-k\}$, where the $H_i$'s are independent mutually orthogonal elements of $G_n$, is the subspace of $G_n$ orthogonal to the $H_i$, that is

$$C = \{P \in G_n \mid H_i \star P = 0,\ 1 \le i \le n-k\}, \tag{33}$$

or more succinctly $C = C(I)^\perp$.

Codewords. There is an important difference between the classical setting and the quantum setting here. Since all elements of a coset of $C(I)$ have the same effect on $\mathcal{C}$, we make no distinction between the elements of such cosets. Therefore the codewords in the quantum setting are grouped in cosets of $C(I)$. Note that all elements of the coset $C(I)$ are the analogue of the zero codeword. With the notation introduced in the previous subsection we have

$$C = \bigcup_{L \in G_k} C(L). \tag{34}$$

Minimum distance. In the classical setting, the minimum distance of a linear code is the smallest Hamming weight of a non-zero codeword. This definition carries over to the quantum setting with the coset $C(I)$ playing the role of the zero codeword. Thus, the minimum distance of a code is the minimum weight $w(P)$ of an element $P$ of $C - C(I)$. With this definition of the minimum distance $d$, it is straightforward to check that the number of errors which are corrected by a decoder which outputs the coset $C(L, S^x)$ containing the element $P$ of lowest weight and satisfying the syndrome is equal to $\left\lfloor \frac{d-1}{2} \right\rfloor$.
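The definition of the minimum distance as the lowest weight in $C - C(I)$ can be checked by exhaustive search on a small code. The example below uses the well-known five-qubit code with generators given by cyclic shifts of $XZZXI$; this code is not discussed in the text and is brought in here only as a familiar test case, and the function names are illustrative.

```python
import itertools

BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}
SYMS = {bits: sym for sym, bits in BITS.items()}

def star(p, q):
    """Symplectic inner product of two Pauli strings."""
    return sum(BITS[a][0] * BITS[b][1] + BITS[a][1] * BITS[b][0]
               for a, b in zip(p, q)) % 2

def add(p, q):
    """Sum in the effective Pauli group (product of Paulis up to phase)."""
    return "".join(SYMS[(BITS[a][0] ^ BITS[b][0], BITS[a][1] ^ BITS[b][1])]
                   for a, b in zip(p, q))

def span(generators):
    """All elements of the stabilizer space C(I)."""
    elements = {"I" * len(generators[0])}
    for g in generators:
        elements |= {add(e, g) for e in elements}
    return elements

def minimum_distance(H):
    """Lowest weight of a zero-syndrome error lying outside C(I)."""
    n, stabilizer = len(H[0]), span(H)
    best = None
    for t in itertools.product("IXYZ", repeat=n):
        P = "".join(t)
        if P not in stabilizer and all(star(P, h) == 0 for h in H):
            w = sum(c != "I" for c in P)
            best = w if best is None else min(best, w)
    return best

H = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]
print(len(span(H)), minimum_distance(H))
```

The search visits all $4^n$ errors, so it is only meant for toy examples; for this code it recovers the known distance of 3, and hence $\lfloor (d-1)/2 \rfloor = 1$ correctable error.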



Information symbols. There is in the quantum setting a natural notion of information sequence corresponding to a Pauli error $P$ which consists in taking the element $L$ in $G_k$ such that there exists an $S$ in $G_{n-k}$ for which $(L : S)V = P$.

IV. Quantum turbo-codes 

In this section, we describe quantum turbo-codes obtained from interleaved serial concatenation of quantum convolutional codes. This first requires the definition of quantum convolutional codes. We will define them through their circuit representation as in [31] rather than through their parity-check matrix as in [15], [18], [1]: this allows us to define the state diagram in a natural way and is also quite helpful for describing the decoding algorithm.

A. Quantum convolutional codes 

A quantum convolutional encoder can be defined quite succinctly as a stabilizer code with encoding matrix $V$ given by the circuit diagram of Fig. 6. The circuit is built from repeated uses of the seed transformation $U$ shifted by $n$ qubits. In this circuit, particular attention must be paid to the order of the inputs as they alternate between stabilizer qubits and logical qubits. This is a slight deviation from the convention established in the previous section, and it is convenient to introduce the following notation to label the different qubits appearing in the encoding matrix of a quantum stabilizer code.

Definition 20: The positions corresponding to $L$ are called the logical positions and the positions corresponding to $S$ are called the syndrome positions.

The total number of identical repetitions of the seed transformation $U$ is called the duration of the code and is denoted $N$. The $m$ qubits that connect gates from consecutive time slices are called memory qubits. The encoding is initialized by setting the first $m$ memory qubits in the $|0_m\rangle$ state. To terminate the encoding, set the $k$ information qubits of the last $t$ time slices in the $|0_k\rangle$ state, where $t$ is a free parameter independent of $N$. The rate of the code is thus $kN/(n(N+t)+m)$ which is of the form $k/n + O(1/N)$ for fixed $t$.

Formally, a quantum convolutional code can be defined as 
follows. 

Definition 21 (Quantum convolutional encoder): Let $n$, $k$, $m$, and $t$ be integers defining the parameters of the code, and $N$ the duration of the encoding. Let $U$ be an $(n+m)$-qubit symplectic matrix called the seed transformation. The encoding matrix $V$ of the quantum convolutional encoder is a symplectic matrix over $m + n(N+t)$ qubits given by

$$V = U_{[1..n+m]}\,U_{[n+1..2n+m]} \cdots U_{[(N+t-1)n+1..(N+t)n+m]} = \prod_{i=1}^{N+t} U_{[(i-1)n+1..in+m]}$$

where $[a..b]$ stands for the integer interval $\{a, a+1, \ldots, b\}$ and where $U_{[(i-1)n+1..in+m]}$ acts on an element $(P_1, \ldots, P_{m+n(N+t)}) \in G_{m+n(N+t)}$ such that its image $(P'_1, \ldots, P'_{m+n(N+t)})$ satisfies: $(P'_{(i-1)n+1}, \ldots, P'_{in+m}) = (P_{(i-1)n+1}, \ldots, P_{in+m})U$ and all other $P'_j$ are given by $P'_j = P_j$. The syndrome symbols correspond to the positions belonging to $[1..m] \cup \bigcup_{i \in [1..N]} [(i-1)n+m+k+1..in+m] \cup \bigcup_{i \in [N+1..N+t]} [(i-1)n+m+1..in+m]$.

Fig. 6. Circuit diagram of a quantum convolutional encoder with seed transformation $U$. The superscripts indicating the number of qubits per wire are omitted for clarity, and can be found on Fig. 7.

Fig. 7. Seed transformation circuit.
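The product formula of Definition 21 can be turned into a short construction of the full encoding matrix. The sketch below is an illustration under stated assumptions: Pauli operators use the interleaved $(x, z)$ bit convention, row vectors act from the left, the seed transformation is the 3-qubit example whose action is listed in the caption of Fig. 8 (so $n = 2$, $k = 1$, $m = 1$), and the helper names (`embed`, `convolutional_encoding_matrix`, `rate`) are hypothetical.

```python
import numpy as np

BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}
SYMS = {bits: sym for sym, bits in BITS.items()}

def to_bits(pauli):
    return np.array([b for c in pauli for b in BITS[c]], dtype=int)

def to_pauli(bits):
    return "".join(SYMS[(int(bits[i]), int(bits[i + 1]))]
                   for i in range(0, len(bits), 2))

def lam(n):
    return np.kron(np.eye(n, dtype=int), np.array([[0, 1], [1, 0]]))

# Seed transformation: rows are the images of X1, Z1, X2, Z2, X3, Z3.
U = np.array([to_bits(r) for r in ("XXX", "ZII", "IXX", "ZZI", "IIX", "IZZ")])

def embed(U, first_qubit, total):
    """Identity on `total` qubits except for U acting on a window of qubits."""
    q = U.shape[0] // 2
    E = np.eye(2 * total, dtype=int)
    lo, hi = 2 * first_qubit, 2 * (first_qubit + q)
    E[lo:hi, lo:hi] = U
    return E

def convolutional_encoding_matrix(U, n, m, N, t):
    """V = product over i of U shifted by n qubits per time slice."""
    total = m + n * (N + t)
    V = np.eye(2 * total, dtype=int)
    for i in range(N + t):
        V = V @ embed(U, i * n, total) % 2
    return V

def rate(n, k, m, N, t):
    return k * N / (n * (N + t) + m)

V = convolutional_encoding_matrix(U, n=2, m=1, N=2, t=0)
print(V.shape, to_pauli(to_bits("YYIII") @ V % 2), rate(2, 1, 1, 100, 2))
```

The input string is ordered as $(S_0 : L_1 : S_1 : L_2 : S_2)$ and the output as $(P_1 : P_2)$ followed by the final memory, matching the wire ordering described for Fig. 6.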

It will be convenient to decompose an element $P$ in $G_{n(N+t)+m}$ as $P = (P_1 : P_2 : \cdots : P_{N+t})$ where the $P_i$ belong to $G_n$ for $i$ in $\{1, 2, \ldots, N+t-1\}$ and $P_{N+t}$ belongs to $G_{n+m}$. This decomposition directly reflects the structure of the output wires appearing on the right-hand side of the circuit diagram of Fig. 6.

Similarly, we will decompose the Pauli stream obtained by applying the inverse encoder to $P$ as

$$(S_0 : L_1 : S_1 : \cdots : L_N : S_N : S_{N+1} : \cdots : S_{N+t}) = PV^{-1},$$

where $S_0$ belongs to $G_m$, the $L_i$'s all belong to $G_k$, the $S_i$'s belong to $G_{n-k}$ for $i$ in $\{1, \ldots, N\}$ and the $S_{N+j}$'s belong to $G_n$ for $j$ in $\{1, \ldots, t\}$. This decomposition directly reflects the structure of the input wires appearing on the left-hand side of the circuit diagram of Fig. 6.

While the $P_j$ are related to the $L_j$ and $S_j$ via a matrix $V$ of dimension $2(N+t)n + 2m$, the convoluted structure of $V$ can be exploited to recursively compute this transformation without the need to manipulate objects of size increasing with $N$. This requires the introduction of auxiliary memory variables $M_i \in G_m$. The recursion is initialized by setting

$$(M_{N+t-1} : S_{N+t}) = P_{N+t}U^{-1}. \tag{35}$$

The $S_i$ for $i \in \{N+1, \ldots, N+t-1\}$ are obtained by recursion on $i$:

$$(M_{i-1} : S_i) = (P_i : M_i)U^{-1} \tag{36}$$

and the $M_{i-1}$, $L_i$, $S_i$ for $i$ in $\{1, \ldots, N\}$ are obtained from the recursion

$$(M_{i-1} : L_i : S_i) = (P_i : M_i)U^{-1}. \tag{37}$$

Finally, set

$$S_0 = M_0. \tag{38}$$
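The recursion of Eqs. (35) to (38) can be checked against a direct multiplication by $V^{-1}$ on a small instance. The following sketch does so under the same assumptions as before (interleaved $(x, z)$ bits, row vectors, the example seed transformation of Fig. 8 with $n = 2$, $m = 1$); `T` stands for $N + t$, and the function names are made up for this illustration.

```python
import numpy as np

BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}

def to_bits(pauli):
    return np.array([b for c in pauli for b in BITS[c]], dtype=int)

def lam(n):
    return np.kron(np.eye(n, dtype=int), np.array([[0, 1], [1, 0]]))

def inverse(S):
    """Inverse of a symplectic matrix: Lambda S^T Lambda."""
    n = S.shape[0] // 2
    return lam(n) @ S.T @ lam(n) % 2

U = np.array([to_bits(r) for r in ("XXX", "ZII", "IXX", "ZZI", "IIX", "IZZ")])

def full_matrix(U, n, m, T):
    """Encoding matrix V as the product of T shifted copies of U."""
    total = m + n * T
    V = np.eye(2 * total, dtype=int)
    for i in range(T):
        E = np.eye(2 * total, dtype=int)
        lo, hi = 2 * i * n, 2 * (i * n + n + m)
        E[lo:hi, lo:hi] = U
        V = V @ E % 2
    return V

def recursive_inverse(P, U, n, m, T):
    """Compute P V^{-1} slice by slice, starting from the last time slice."""
    U_inv = inverse(U)
    blocks = [P[2 * n * i:2 * n * (i + 1)] for i in range(T)]
    memory = P[2 * n * T:]                 # memory part of the last block of P
    outputs = []
    for i in reversed(range(T)):
        x = np.concatenate([blocks[i], memory]) @ U_inv % 2
        memory, rest = x[:2 * m], x[2 * m:]  # (M_{i-1} : L_i : S_i)
        outputs.insert(0, rest)
    return np.concatenate([memory] + outputs)  # S_0 = M_0 comes first

n, m, T = 2, 1, 4
rng = np.random.default_rng(0)
P = rng.integers(0, 2, size=2 * (m + n * T))
direct = P @ inverse(full_matrix(U, n, m, T)) % 2
print(np.array_equal(recursive_inverse(P, U, n, m, T), direct))
```

Each step only touches $n + m$ qubits, so the work grows linearly with the number of time slices while the per-step objects keep a fixed size.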



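Any Clifford transformation $U$ on $n + m$ qubits can be used as a seed transformation and defines a convolutional code. It will be useful to decompose $U$ into blocks of various sizes

$$U = \begin{pmatrix} \mu_P & \mu_M \\ \Lambda_P & \Lambda_M \\ \cdots & \cdots \end{pmatrix} \tag{39}$$

where the two column blocks have widths $2n$ and $2m$, and the three row blocks have heights $2m$, $2k$ and $2(n-k)$.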



Just like in the classical case, this definition of quantum convolutional code can easily be seen to be equivalent to the ones that have previously appeared in the literature [15], [18], [1]. In particular, the D-transform associated to the code can easily be obtained from the sub-matrices of $U$ appearing in Eq. (39). However, these concepts will not be important for our analysis.

Our definition of convolutional codes is stated in terms of their encoding matrix $V$. From this perspective, convolutional codes are ordinary, albeit very large, stabilizer codes. However, there are important aspects of convolutional codes that distinguish them from generic stabilizer codes.

As mentioned in the previous section, stabilizer codes have in general encoding circuits using a number of elements proportional to the square of the number of physical qubits. Convolutional codes have by definition circuit complexity that scales linearly with $N$ for fixed $m$: each application of the seed transformation $U$ requires a constant number of gates, and this transformation is repeated $N + t$ times.

The most important distinction however has to do with the decoding complexity. The maximum-likelihood decoder of a stabilizer code consists in an optimization over the logical cosets, of which there are $4^K$ where $K$ denotes the number of encoded qubits. Without any additional structure on $V$, maximum-likelihood decoding is an NP-hard problem [5]. Quantum convolutional codes on the other hand have decoding complexity that scales linearly with $K$. The algorithm that accomplishes this task will be described in detail in Sec. V.

B. State diagram 

We will now define some properties of convolutional codes that will play important roles in the analysis of the performance of turbo-codes. Most of these definitions rely on the state diagram of a convolutional code, which is defined similarly as in the classical case.

Definition 22 (State diagram): The state diagram of an encoder with seed transformation $U$ and parameters $(n, k, m)$ is a directed multi-graph with $4^m$ vertices called memory-states, each labeled by an $M \in G_m$. Two vertices $M$ and $M'$ are linked by an edge $M \to M'$ with label $(L, P)$ if and only if there exists $L \in G_k$, $P \in G_n$ and an $S^z \in \{I, Z\}^{n-k}$ such that

$$(P : M') = (M : L : S^z)U. \tag{40}$$

The labels $L$ and $P$ are referred to as the logical label and physical label of the edge respectively.

Thus, the state diagram represents partial information about the transformation $(M : L : S^z) \mapsto (P : M')$ generated by the seed transformation $U$. Partial information because all information about $S^z$ is discarded. Note that $S^z \in \{I, Z\}^{n-k}$, so the state diagram only contains information about the streams of Pauli operators that remain in the set of codewords $C$. The restriction on the input can be lifted if we instead consider the effective seed transformation, in which the matrix $[\Sigma_P : \Sigma_M]$ is obtained by removing every second row from the last $2(n-k)$ rows of $U$ (i.e. the rows which represent the action on the $X_i$). This definition will be convenient for later analysis.

The state diagram of the seed transformation represented at Fig. 8 is shown at Fig. 9. For instance, the self-loop at $I$ labeled $(I, II)$ represents the trivial fact that $(I : I : I)U = (II : I)$. The edge from $Y$ to $I$ labeled $(Y, XZ)$ represents the transformation $(Y : Y : I)U = (XZ : I)$, and so on.

Fig. 8. Seed transformation for an $n = 2$, $k = 1$, and $m = 1$ quantum convolutional code. It corresponds to a unitary transform which maps $|a\rangle \otimes |b\rangle \otimes |c\rangle$ to $|a\rangle \otimes |a+b\rangle \otimes |a+b+c\rangle$ for $a, b, c \in \{0, 1\}$. Therefore the seed transformation $U$ acts as follows on the $Z_i$ and $X_i$: $Z_1U = (Z, I, I)U = (Z, I, I)$, $X_1U = (X, X, X)$, $Z_2U = (Z, Z, I)$, $X_2U = (I, X, X)$, $Z_3U = (I, Z, Z)$, $X_3U = (I, I, X)$.
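The state diagram of Definition 22 can be generated mechanically from the seed transformation. The sketch below builds $U$ from the six images listed in the caption of Fig. 8 and enumerates every edge $M \to M'$ with its label $(L, P)$; the interleaved $(x, z)$ bit convention and the names `act` and `state_diagram` are assumptions of this example.

```python
import itertools
import numpy as np

BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}
SYMS = {bits: sym for sym, bits in BITS.items()}

def to_bits(pauli):
    return np.array([b for c in pauli for b in BITS[c]], dtype=int)

def to_pauli(bits):
    return "".join(SYMS[(int(bits[i]), int(bits[i + 1]))]
                   for i in range(0, len(bits), 2))

# Rows are the images of X1, Z1, X2, Z2, X3, Z3 given in the caption of Fig. 8.
U = np.array([to_bits(r) for r in ("XXX", "ZII", "IXX", "ZZI", "IIX", "IZZ")])

def act(pauli, matrix):
    """Right action of a binary matrix on a Pauli string."""
    return to_pauli(to_bits(pauli) @ matrix % 2)

def strings(length, alphabet):
    return ["".join(t) for t in itertools.product(alphabet, repeat=length)]

def state_diagram(U, n, k, m):
    """List of edges (M, M', L, P) with (P : M') = (M : L : S^z) U."""
    edges = []
    for M in strings(m, "IXYZ"):
        for L in strings(k, "IXYZ"):
            for Sz in strings(n - k, "IZ"):
                out = act(M + L + Sz, U)
                edges.append((M, out[n:], L, out[:n]))
    return edges

edges = state_diagram(U, n=2, k=1, m=1)
zero_weight = sorted(e for e in edges if e[3] == "I" * 2)
print(len(edges), zero_weight)
```

With these parameters the diagram has $4^m \cdot 4^k \cdot 2^{n-k} = 32$ edges, and the edges of zero physical weight are the ones that matter for the catastrophicity analysis that follows.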



The state diagram is crucial for analyzing the properties of 
the associated code, and also for defining some of its essential 
features. Here, we give some definitions based on the state 
diagram that will be important in our analysis. 

Definition 23 (Path): A path in the state diagram is a sequence of vertices $M_1, M_2, \ldots$ such that $M_i \to M_{i+1}$ belongs to the state diagram.

Each element of $C$ is naturally associated to a path in the state diagram, which corresponds to the memory states visited upon its encoding. The physical and logical weights of a codeword can be obtained by adding the corresponding weights of the edges in the path associated to the codeword. More generally, we will refer to the weight of a path as the sum of the weights of its edges.

Fig. 9. State diagram for the seed transformation shown at Fig. 8.



Definition 24 (Zero-physical-weight cycle): A zero-physical-weight cycle is a closed path in the state diagram that uses only edges with zero physical weight.

In the state diagram of Fig. 9 for example, there are two zero-physical-weight cycles corresponding to the transformations $(I : I : I)U = (II : I)$ and $(Z : Z : Z)U = (II : Z)$.

Definition 25 (Non-catastrophic encoder): An encoder $V$ is non-catastrophic if and only if the only cycles in its state diagram with physical weight 0 have logical weight 0.

We see for instance that the state diagram of our running example is catastrophic due to the presence of the self-loop with label $(Z, II)$ at state $Z$: this cycle has physical weight 0 and logical weight 1. To understand the consequences of a catastrophic seed transformation, consider the act of inverting the encoding transformation of the associated convolutional encoder. This is done by running the circuit of Fig. 6 backwards. Suppose that a single $Y$ error affected the transmitted qubits. More specifically, at time $i$, $1 < i < N$, there is a $Y$ on the lower physical wire of the seed transformation of Fig. 8 (i.e. $P_i = IY$) and everything else is $I$. Since $(IY : I)U^{-1} = (Z : Y : X)$, this will result in a $Z$ in the memory qubit $M_{i-1}$, a $Y$ in the logical qubit $L_i$, and an $X$ in the stabilizer qubit $S_i$. The $S_i = X$ triggers a non-trivial syndrome, which signals the presence of an error. Moreover, because of the self-loop at $M = Z$ that has non-zero logical weight but zero physical weight, this error will continue to propagate without triggering additional syndrome bits, while creating $Z$'s in $L_{i-1}$ and $M_{i-2}$, and in $L_{i-2}$ and $M_{i-3}$, and so on. Thus, an error of finite physical weight results in an error of unbounded logical weight, and a finite syndrome. This is the essence of catastrophic error propagation.
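The catastrophicity test of Definition 25 amounts to a cycle search restricted to edges of zero physical weight. The sketch below runs it on the running example and also replays the two backward steps described above; as before, the bit convention, the reconstruction of $U$ from the caption of Fig. 8, and the function names are assumptions of this illustration.

```python
import itertools
import numpy as np

BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}
SYMS = {bits: sym for sym, bits in BITS.items()}

def to_bits(pauli):
    return np.array([b for c in pauli for b in BITS[c]], dtype=int)

def act(pauli, matrix):
    bits = to_bits(pauli) @ matrix % 2
    return "".join(SYMS[(int(bits[i]), int(bits[i + 1]))]
                   for i in range(0, len(bits), 2))

def weight(pauli):
    return sum(c != "I" for c in pauli)

U = np.array([to_bits(r) for r in ("XXX", "ZII", "IXX", "ZZI", "IIX", "IZZ")])
LAM = np.kron(np.eye(3, dtype=int), np.array([[0, 1], [1, 0]]))
U_INV = LAM @ U.T @ LAM % 2   # inverse of a symplectic matrix

def zero_weight_edges(U, n, k, m):
    """Edges (M, M', L) of the state diagram whose physical label is trivial."""
    found = []
    for M in itertools.product("IXYZ", repeat=m):
        for L in itertools.product("IXYZ", repeat=k):
            for Sz in itertools.product("IZ", repeat=n - k):
                out = act("".join(M + L + Sz), U)
                if weight(out[:n]) == 0:
                    found.append(("".join(M), out[n:], "".join(L)))
    return found

def is_catastrophic(U, n, k, m):
    """True if a zero-physical-weight cycle carries non-zero logical weight."""
    edges = zero_weight_edges(U, n, k, m)
    successors = {}
    for a, b, _ in edges:
        successors.setdefault(a, set()).add(b)

    def reaches(src, dst):
        seen, stack = set(), [src]
        while stack:
            v = stack.pop()
            if v == dst:
                return True
            if v not in seen:
                seen.add(v)
                stack.extend(successors.get(v, ()))
        return False

    return any(weight(L) > 0 and reaches(b, a) for a, b, L in edges)

print(is_catastrophic(U, 2, 1, 1), act("IYI", U_INV), act("IIZ", U_INV))
```

The last two values replay the argument in the text: the first backward step maps $(IY : I)$ to $(Z : Y : X)$, and every further step maps $(II : Z)$ to $(Z : Z : Z)$, leaving a $Z$ on the logical qubit of each earlier slice without flipping any syndrome bit.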

Catastrophic encoders may have large minimal distances, 
but perform poorly under iterative decoding. All the codes 
we have considered in our numerical simulations were non- 
catastrophic. In fact, they even satisfied a stronger condition: 

Definition 26 (Completely non-catastrophic code): A completely non-catastrophic code is such that the only loop in its state diagram with physical weight zero is the self-loop at $I_m$.

In the classical setting, non-catastrophicity is ensured for instance by the use of systematic encoders. For such encoders, the logical string $c$ is contained as a substring of the encoded string $\bar{c} = cV$. Systematic quantum encoding can be obtained by setting the first $k$ columns of $\Lambda_P = \mathbb{1}_k$ and the first $k$ columns of $\Sigma_P = \mu_P = 0$ (c.f. Eq. (41)). However, this would imply that the stabilizers act trivially on the first $k$ output qubits, resulting in a minimum distance equal to 1. We conclude that it is not possible to design a systematic quantum encoder with minimum distance greater than 1. Thus, non-catastrophicity is a condition that needs to be built in by hand. Fortunately, it can be efficiently verified directly on the state diagram and we have made great use of this fact.

In the classical setting, turbo-codes can be designed with a minimum distance that grows polynomially with $N$ when the inner code is recursive. Recall that recursive means that the encoder has an infinite impulse response: when a single 1 is input on any logical wire of the encoding circuit of Fig. 2 and every other input is 0, the resulting output has infinite weight for a code of infinite duration. This definition can be generalized to the quantum setting.

Definition 27 (Quasi-recursive encoder): Consider executing the encoding circuit of Fig. 6 on an input containing a single non-identity Pauli operator on a logical wire, with all other inputs set to $I$. The corresponding encoder $V$ is quasi-recursive when the resulting output has infinite weight when the code has infinite duration $N$.

However, it can be verified that this notion of recursiveness is too weak to derive a good lower bound on the minimum distance of turbo-codes. This departure from the theory of classical codes stems from the fact that quantum codes are coset codes. As in the classical case, the proper definition of a recursive encoder demands that it generates an infinite impulse response. The novelty comes from the fact that this must be true for every element in the coset associated to the impulse logical input: not only must the encoded version of $X_i$, $Y_i$, and $Z_i$ have weight growing with the duration of the code $N$, but so must every element of $C(X_i)$, $C(Y_i)$, and $C(Z_i)$. Formally, we can define recursive quantum convolutional encoders in two steps:

Definition 28 (Admissible path): A path in the state dia- 
gram is admissible if and only if its first edge is not part 
of a zero physical-weight cycle. 

Definition 29 (Recursive encoder): A recursive encoder is 
such that any admissible path with logical-weight 1 starting 
from a vertex belonging to a zero physical-weight loop does 
not contain a zero physical-weight loop. 

Once again, this property can be directly and efficiently 
tested given the seed transformation of the convolutional code 
by constructing its state diagram. 

C. Interleaved serial concatenation 

Quantum turbo-codes are obtained from a particular form
of interleaved concatenation of quantum convolutional codes.
Interleaving is slightly more complex in the quantum setting
since, in addition to permuting the qubits, it is also possible
to perform a Clifford transformation on each qubit, which
amounts to permuting X, Y and Z. More precisely:

Definition 30 (Quantum interleaver): A quantum
interleaver $\Pi$ of size $N$ is an $N$-qubit symplectic
transformation composed of a permutation $\pi$ of the $N$
qubit registers and a tensor product of single-qubit symplectic
transformations. It acts as follows by multiplication on the
right on $G_N$:

$$(P_1, \ldots, P_N) \mapsto (P_{\pi(1)} K_1, \ldots, P_{\pi(N)} K_N)$$

where $K_1, \ldots, K_N$ are some fixed symplectic matrices acting
on $G_1$.
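As an illustration of Definition 30, the following minimal sketch represents a Pauli stream as a list of single-qubit labels in {I, X, Y, Z}, permutes the positions, and then relabels each position by a single-qubit transformation that permutes X, Y and Z. The names `make_clifford`, `interleave` and `weight`, the label encoding, and the 0-based indexing are assumptions made for this example only; phases are ignored throughout.

```python
import random

def make_clifford(images):
    """Single-qubit relabeling: `images` is a permutation of 'XYZ' giving the
    images of X, Y, Z (phases ignored); I is always mapped to I."""
    table = dict(zip("XYZ", images))
    table["I"] = "I"
    return table

def interleave(stream, perm, cliffords):
    """Return (P_{perm[0]} K_0, ..., P_{perm[N-1]} K_{N-1}) (0-based indices)."""
    return [cliffords[i][stream[perm[i]]] for i in range(len(stream))]

def weight(stream):
    """Number of non-identity positions in a Pauli stream."""
    return sum(p != "I" for p in stream)

rng = random.Random(0)  # fixed seed, for reproducibility only
N = 8
stream = [rng.choice("IXYZ") for _ in range(N)]
perm = list(range(N))
rng.shuffle(perm)
relabelings = ["XYZ", "XZY", "YXZ", "YZX", "ZXY", "ZYX"]
cliffords = [make_clifford(rng.choice(relabelings)) for _ in range(N)]
scrambled = interleave(stream, perm, cliffords)
print("".join(stream), "->", "".join(scrambled))
print(weight(stream) == weight(scrambled))  # True: the weight is preserved
```

Because each relabeling maps I to I and non-identity labels to non-identity labels, and a permutation only moves positions around, the weight of the stream cannot change.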

It follows that interleavers preserve the weight of $N$-qubit Pauli
streams. An interleaved serial concatenation of two quantum
encoders has three basic components:

1) An outer code encoding $k^{Out}$ qubits by embedding them
in a register of $n^{Out}$ qubits, with encoder $V^{Out}$.

2) An inner code encoding $k^{In}$ qubits by embedding them
in a register of $n^{In}$ qubits, with encoder $V^{In}$, which
is such that $k^{In} = n^{Out}$.

3) A quantum interleaver $\Pi$ of size $N = n^{Out} = k^{In}$.

The resulting encoding matrix of the interleaved concatenated
code is a symplectic matrix $V$ acting on $G_{n^{In}}$ such that

$$V = V'^{Out}\, \Pi'\, V^{In},$$

with the action of $V'^{Out}$ and $\Pi'$ on $G_{n^{In}}$ being defined by

$$(L : S^{Out} : S^{In})\, V'^{Out} = \big((L : S^{Out})\, V^{Out} : S^{In}\big) \tag{42}$$

for $(L : S^{Out} : S^{In}) \in G_{k^{Out}} \times G_{n^{Out}-k^{Out}} \times G_{n^{In}-k^{In}}$, and

$$(L' : S^{In})\, \Pi' = (L'\Pi : S^{In}) \tag{43}$$

for $L' \in G_{n^{Out}}$. These relations are summarized in Fig. 10.



The rate of the concatenated code is equal to

$$\frac{k^{Out}}{n^{In}} = \frac{k^{Out}}{n^{Out}} \cdot \frac{k^{In}}{n^{In}},$$

that is, the product of the rates of the inner code
and the outer code.

A serial quantum turbo-code is obtained from this interleaved
concatenation scheme by choosing $V^{In}$ and $V^{Out}$ to be
quantum convolutional encoders.

Fig. 10. Circuit diagram for a turbo encoder.

D. Figure of merit 

There are a number of inequivalent ways of characterizing
the performance of a code. We might use the minimum
distance, but a quantity that is more informative is the weight
enumerator, which counts the number of undetected harmful
errors of each weight. For a convolutional code however, as
the number $K$ of encoded qubits tends to infinity, the weight
enumerator will be infinite. Indeed, because of the translational
invariance of the encoding circuit, finite error patterns come in
an infinite number of copies obtained by translation. Instead,
we can consider the distance spectrum of a non-catastrophic
encoder, which is defined as follows.

Definition 31 (Distance spectrum): The distance spectrum
$(F(w))_{w>0}$ of a non-catastrophic convolutional encoder is a
sequence for which $F(w)$ is the number of admissible paths
in the state diagram starting and ending in memory states that
are part of zero-weight cycles, and with physical-weight $w$
and logical weight greater than 0.

Another relevant quantity is the distance spectrum for
logical-weight-one elements of $C$, which is defined similarly.

Definition 32 (Logical-weight-one distance spectrum):
The distance spectrum for logical-weight-one codewords
$F_1(w)$ of a non-catastrophic convolutional encoder is the
number of admissible paths in the state diagram starting and
ending in memory states that are part of zero-weight cycles,
and with physical-weight $w$ and logical weight 1.

It can easily be seen that the minimum distance of a turbo-code
obtained from the concatenation of two convolutional
codes is no greater than $d_*^{Out}\, d_1^{In}$, where
$d_1 = \min\{w : F_1(w) > 0\}$ and the free minimal distance is
$d_* = \min\{w : F(w) > 0\}$.
The free distance is defined similarly to the classical case
by the smallest weight of a harmful undetected error in the
convolutional code with infinite time duration. It is, so to
speak, a kind of typical minimal distance for convolutional
codes, ignoring finite-size effects. To maximize the minimum
distance of the turbo-code, we must use outer codes with large
free distances and inner encoders with large values of $d_1$.
Recursive encoders for instance have $d_1$ proportional to $N$,
and would therefore serve as ideal inner codes. However, as we
will see in the next section, we cannot use recursive encoders
as inner codes. Hence, a good rule of thumb is to
use inner encoders that minimize the value of $F_1(w)$ at small
$w$, and similarly use an outer code which minimizes the value
of $F(w)$ at small $w$. These choices result in a turbo-code whose
distance spectrum is small at low weights.
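The upper bound just described can be evaluated mechanically from the two distance spectra. The sketch below is only an illustration: the function names are hypothetical, the spectra are stored as dictionaries mapping a physical weight to a number of paths, and the numerical values are invented so that the outer free distance is 6 and the inner logical-weight-one distance is 4.

```python
def min_positive_weight(spectrum):
    """Smallest weight w with spectrum[w] > 0, or None if there is none.

    `spectrum` maps a physical weight w to the number of admissible paths.
    """
    weights = [w for w, count in spectrum.items() if count > 0]
    return min(weights) if weights else None

def turbo_distance_upper_bound(F_outer, F1_inner):
    """Product of the outer free distance and the inner logical-weight-one
    distance, i.e. the upper bound on the turbo-code minimum distance."""
    return min_positive_weight(F_outer) * min_positive_weight(F1_inner)

# Hypothetical spectra, chosen only for illustration.
F_outer = {5: 0, 6: 3, 7: 11}    # free distance of the outer code: 6
F1_inner = {3: 0, 4: 1, 5: 2}    # logical-weight-one distance of the inner code: 4
print(turbo_distance_upper_bound(F_outer, F1_inner))  # 24
```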

Finally, given an error model, the word error rate (WER)
and qubit error rate (QER) provide good operational figures
of merit. The QER is the probability that an individual
logical qubit is incorrectly decoded. In other words, the QER
represents the fraction of logical qubits that have errors after
decoding. The WER is the probability that at least one
qubit in the block is incorrectly decoded. We expect in general
$\mathrm{QER} \ll \mathrm{WER}$. The WER is thus a much more demanding
figure of merit than the QER. For instance, if $N$ qubits are
encoded in $N/k$ block codes for some constant $k$, then as
$N$ increases, the WER approaches 1 exponentially while the
QER remains constant. As we will see, turbo-codes have a
completely different behavior. In general, we will be interested
in the WER averaged over the choice of interleaver $\Pi$.
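The block-code example above can be made quantitative under the simplifying assumption (ours, for illustration) that the $N/k$ blocks fail independently, each with the same probability. The WER is then one minus the probability that every block succeeds, while the per-qubit error rate does not depend on the number of blocks. The function name and the failure probability 0.01 are hypothetical.

```python
def wer_independent_blocks(block_failure_prob, num_blocks):
    """Probability that at least one of `num_blocks` independent blocks fails."""
    return 1.0 - (1.0 - block_failure_prob) ** num_blocks

# The per-block failure probability stays fixed while the number of blocks grows.
for num_blocks in (1, 10, 100, 1000):
    print(num_blocks, wer_independent_blocks(0.01, num_blocks))
```

The printed values approach 1 as the number of blocks grows, which is the exponential degradation referred to in the text.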

E. Recursive convolutional encoders are catastrophic 

In the classical setting, non-catastrophic and recursive
convolutional encoders are of particular interest. When they are
used as the inner encoder of a concatenated coding scheme, the
resulting code has a minimal distance that grows polynomially with
its length and offers good iterative decoding performance.
More precisely, random serial turbo-codes have a minimum
distance which is typically of order $N^{(d_*^{Out}-2)/d_*^{Out}}$ when the inner
encoder is recursive, where $N$ is the length of the concatenated
code and $d_*^{Out}$ the free distance of the outer code [22]. That
the encoder be non-catastrophic is important to obtain good
iterative decoding performance.

This result and its proof would carry over to the quantum
setting almost verbatim with our definition of recursive
encoders; the quantum case is only slightly more subtle due to the
coset structure of the code. Unfortunately, such encoders do
not exist:

Theorem 1: Quantum convolutional recursive encoders are 
catastrophic. 

This result is perhaps surprising since the notions of
catastrophic and recursive are quite distinct in the classical setting.
Nonetheless, the stringent symplectic constraints imposed on
the seed transformation $U$ give rise to a conflicting relation
between them. The proof of Theorem 1 is rather involved.
Here we present its main steps and leave the details to the
appendix.

The proof involves manipulation of the rows of the effective
encoding matrix of Eq. (41), and for that reason it is more
appropriate to view effective Pauli operators as elements of $\mathbb{F}_2^{2n}$.
The proof proceeds directly by demonstrating that the state
diagram of any recursive convolutional encoder contains a
directed cycle with zero physical-weight and non-zero logical
weight. We first need a characterization of the memory states
$M$ that can be part of a zero physical-weight cycle. We break
this into three steps. First, we characterize the set of states
that are the endpoints of edges in the state diagram with zero
physical-weight. In other words, we want to find all
possible values for the memory element $M' \in \mathbb{F}_2^{2m}$ such that
there exist $M \in \mathbb{F}_2^{2m}$, $S \in \mathbb{F}_2^{n-k}$, and $L \in \mathbb{F}_2^{2k}$ such that
$(M : L : S)\, U_{eff} = (0_{2n} : M')$.

Lemma 2: Given a seed transformation $U$, let $\mathcal{S}$ be the
subspace of $\mathbb{F}_2^{2m}$ spanned by the rows of $\Sigma_M$. The set of
endpoints of edges with zero physical label is equal to $\mathcal{S}^\perp$,
and conversely, any $M$ in $\mathcal{S}^\perp$ is the end vertex in the state
diagram of exactly one edge of zero physical-weight.

If the state diagram contains a zero physical-weight cycle,
it is therefore necessarily supported on the subset of vertices
$\mathcal{S}^\perp$. However, edges of zero physical-weight with endpoints
in $\mathcal{S}^\perp$ may originate from vertices outside $\mathcal{S}^\perp$. Such
edges are not part of zero physical-weight cycles. The next
step is thus to characterize the set of endpoints of edges
with zero physical label and starting point in $\mathcal{S}^\perp$. Since in
the absence of other inputs each time interval modifies the
memory state by $M \to M \mu_M$, we intuitively expect this set
to be $\mathcal{S}_0^\perp$, where $\mathcal{S}_0$ is the smallest subspace containing $\mathcal{S}$ and
stable under $\mu_M$. This is confirmed by the following lemma.

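Lemma 3: Given a seed transformation $U$, let

$$\mathcal{S}_0 = \sum_{i=0}^{\infty} \mathcal{S}\, \mu_M^i. \tag{44}$$

For any element $M'$ of $\mathcal{S}_0^\perp$, there exists a unique element $M$
in $\mathcal{S}_0^\perp$ such that there is an edge of zero physical-weight from
$M$ to $M'$.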

This lemma narrows down the set of vertices in the state
diagram that can support zero physical-weight cycles. In
particular, we can define a sub-graph of the state diagram
obtained from the vertex set $\mathcal{S}_0^\perp$ and directed edges with trivial
physical labels. This subgraph is guaranteed to have constant
in-degree 1 for all its vertices, but some of its vertices may
have no outgoing edges. These would definitely not be part
of a cycle. To ensure that all vertices in the subgraph have a
positive number of outgoing edges, we must once more restrict
its set of vertices. The (left) nullspaces of the $\mu_M^i$'s will play a
fundamental role.

Notation 7: Let $\mu$ be a linear mapping from $\mathbb{F}_2^{2m}$ to itself.
We denote by $\mathrm{Null}(\mu)$ the (left) nullspace of $\mu$, that is

$$\mathrm{Null}(\mu) = \{M \in \mathbb{F}_2^{2m} \mid M\mu = 0_{2m}\}.$$

Notation 8: Let $\mathcal{N}(\mu_M) = \sum_{i=1}^{\infty} \mathrm{Null}(\mu_M^i)$ and $\mathcal{V} = \mathcal{S}_0 + \mathcal{N}(\mu_M)$.
Let $\mathcal{G}$ be the sub-graph of the state diagram obtained from the
vertex set $\mathcal{V}^\perp$ and edges with trivial physical label. This graph
is called the kernel graph of the quantum convolutional code
with seed transformation $U$.

By replacing the vertex set $\mathcal{S}_0^\perp$ by $\mathcal{V}^\perp$, our goal was to
eliminate any vertex with no outgoing edge. This turns out to
be successful, as shown by the following lemma.

Lemma 4: The kernel graph has constant in-degree 1 and 
positive out-degree for any vertex. 

Thus, any cycle with zero physical-weight must be
supported on the kernel graph of the seed transformation. The
next step in order to prove Theorem 1 is to demonstrate
that when the quantum convolutional encoder is recursive, its
corresponding kernel graph $\mathcal{G}$ does not only consist of the
single zero vertex with a self-loop attached to it, corresponding
to the trivial relation $(0_{2m} : 0_{2k} : 0_{n-k})\, U_{eff} = (0_{2n} : 0_{2m})$.

Lemma 5: The kernel graph of a recursive quantum
convolutional encoder has strictly more than one vertex.

This result is an essential distinction between the quantum and
the classical case. In the classical case, when the memory state
is non-zero, it is always possible to create a non-zero physical
output, for instance by copying the state of the memory at the
output. But this is not possible quantum-mechanically.

Before proving that $\mathcal{G}$ contains a cycle with non-zero
logical weight, we will first prove that it contains at least one
edge with non-zero logical weight. For this purpose, let us
characterize the subset of edges with zero physical-weight and
zero logical weight.

Lemma 6: Given a seed transformation $U$, let $\mathcal{L}$ be the
subspace of $\mathbb{F}_2^{2m}$ spanned by the rows of $\Lambda_M$ and $\Sigma_M$. The
set of endpoints of edges of zero physical and logical weight
is equal to $\mathcal{L}^\perp$.

This result and its proof are structurally similar to Lemma 2,
except that $\mathcal{S}$ has been replaced by $\mathcal{L}$. From this, we conclude:

Lemma 7: The kernel graph of a recursive quantum con- 
volutional encoder contains an edge with non-zero logical 
weight. 



Armed with this result, we are now in a position to prove the 
main result of this section. 

Proof (of Theorem 1): Consider a recursive quantum
convolutional encoder and its associated kernel graph. By
Lemma 7, this graph has at least one edge with non-zero
logical weight. Let us say that it goes from $M_0$ to $M_1$. From
Lemma 4, we can follow a directed path of arbitrary length $l$
with $(M_0, M_1)$ as starting edge:

$$M_0 \to M_1 \to \cdots \to M_{l-1} \to M_l.$$

If the length of the path is greater than the number of vertices
of the graph, it must contain the same vertex at least twice, and
therefore it contains a cycle.
Moreover, $M_0$ must be part of this cycle. Otherwise, we would
have a path of the form $M_0 \to M_1 \to \cdots \to M_j \to M_{j+1} \to
\cdots \to M_l \to M_j$ with $j > 0$. In this case, $M_j$ would have
in-degree 2, which is impossible. In other words, there is a
directed cycle in the state diagram with zero physical-weight
and non-zero logical weight. The corresponding convolutional
encoder is therefore catastrophic. ■
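The combinatorial core of this argument can be checked on a toy graph. The sketch below is purely illustrative: the graph is a hypothetical stand-in for a kernel graph, stored as a successor map in which every vertex has in-degree exactly 1 and one outgoing edge. Following edges from any vertex until a vertex repeats, the first repeated vertex is always the starting one, which is the statement that $M_0$ lies on the cycle.

```python
def in_degrees(successor):
    """In-degree of every vertex of a graph given as {vertex: next_vertex}."""
    deg = {v: 0 for v in successor}
    for target in successor.values():
        deg[target] += 1
    return deg

def walk_until_repeat(successor, start):
    """Follow edges from `start` until a vertex is visited twice.

    Returns (path, repeated), where `repeated` is the first vertex seen twice.
    """
    seen = set()
    path = []
    v = start
    while v not in seen:
        seen.add(v)
        path.append(v)
        v = successor[v]
    return path, v

# Hypothetical toy graph: every vertex has in-degree 1.
toy_graph = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3}
print(in_degrees(toy_graph))
for start in toy_graph:
    path, repeated = walk_until_repeat(toy_graph, start)
    print(start, path, repeated == start)
```

If some vertex other than the start were the first to repeat, it would be entered both from its predecessor on the initial segment and from the end of the cycle, giving it in-degree 2.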

V. Decoding 

This section describes the decoding procedure for turbo-codes
operated on memoryless Pauli channels. With an $n$-qubit
memoryless Pauli channel, errors are elements of $G_n$
distributed according to a product distribution
$P(P_1 : P_2 : \ldots : P_n) = f_1(P_1) f_2(P_2) \cdots f_n(P_n)$. The depolarizing channel
described in Section III is a particular example of such a
channel where all $f_j$ are equal. We note that our algorithm can
be extended to non-Pauli errors using the belief propagation
algorithm of [25], but leave this generalization for a future
paper. The decoding algorithm we present is an adaptation
to the quantum setting of the usual "soft-input soft-output"
algorithm used to decode serial turbo-codes (see [2]). It differs
from the classical version in several points.



1) As explained in Subsection III-C, for decoding a quantum
code we do not consider the state of the qubits
directly (which belongs to a continuous space and which
cannot be measured without being disturbed) but instead
consider the Pauli error (which is discrete) that has
affected the quantum state. Decoding consists in inferring
the transformation that has affected the state rather than
inferring what the state should be.

2) Decoding a quantum code is related to classical "syndrome
decoding" (see [27, chapter 47]), with the caveat
that errors differing by a combination of the rows of
the parity-check matrix act identically on the codewords.
Thus, maximum-likelihood decoding consists in identifying
the most likely error coset given the syndrome.
The coset with largest probability can differ from the
one containing the most likely Pauli error.

3) We cannot assume, as in the classical case, that the
soft-input soft-output decoder of the convolutional quantum
code starts at the zero-state and ends at the zero-state.
This is related to the fact that the memory is described
in terms of the Pauli error that has affected the qubits
rather than reflecting a property of the encoded state.
Instead, we perform a measurement which reveals partial
information (the X component) about the first memory
element.
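The second point, that the most likely coset need not contain the most likely error, can be illustrated with a toy calculation. All error labels, coset labels and probabilities below are hypothetical; they merely stand for the errors compatible with one observed syndrome, grouped by the logical coset they belong to.

```python
from collections import defaultdict

# Hypothetical errors compatible with a single syndrome:
# (error label, logical coset label, probability).
candidates = [
    ("E1", "coset_A", 0.40),
    ("E2", "coset_B", 0.30),
    ("E3", "coset_B", 0.30),
]

def coset_of_most_likely_error(errors):
    """Coset containing the single most probable error."""
    return max(errors, key=lambda e: e[2])[1]

def most_likely_coset(errors):
    """Coset with the largest total probability (degenerate decoding)."""
    totals = defaultdict(float)
    for _, coset, prob in errors:
        totals[coset] += prob
    return max(totals, key=totals.get)

print(coset_of_most_likely_error(candidates))  # coset_A
print(most_likely_coset(candidates))           # coset_B
```

Here the single most probable error lies in coset A, but the two equivalent errors of coset B together carry more probability, so a decoder that exploits degeneracy selects coset B.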

Let us now describe how each constituent convolutional 
code is decoded with a soft-input soft-output decoder. 

A. Decoding of convolutional codes 



As stated in Definition 18, qubit-wise maximum likelihood decoding
consists in finding the logical operator $L_i$ that maximizes
the marginal conditional probability $P(L_i|S^x)$. We call the
algorithm that computes this probability, without returning
the $L_i$ that optimizes it, a soft-input soft-output
(SISO) decoder. The purpose of this section is to explain how
such a decoder can be implemented efficiently for quantum
convolutional codes.

We choose to base our presentation solely on the circuit
description of the code. Our algorithm is essentially equivalent
to a sum-product algorithm operated on the trellis of the code
[33]. However, the novelties of quantum codes listed above
require some crucial modifications of the trellis-based decoding.
We find that these complications are greatly alleviated when
decoding is formulated directly in terms of the circuit.

Since the distinctions between trellis-based and circuit-based
decoding are technical rather than conceptual, we will present
the procedure in detail and omit its derivation from first
principles. As usual, when operated on a memoryless Pauli
channel, the whole procedure is nothing but Bayesian updating
of probabilities.

Consider a quantum convolutional code with parameters
$(n, k, m, t)$, seed transformation $U$ and duration $N$ as shown
in Fig. 6. We use the same notation as in Subsection IV-A
and denote by $V$ the associated encoding matrix. Let us recall
that it maps $G_{n(N+t)+m}$ to itself. As in Subsection IV-A, we
decompose an element $P$ in $G_{n(N+t)+m}$ (i.e., an error on the
channel) as $P = (P_1 : P_2 : \cdots : P_{N+t})$, where the $P_i$'s belong
to $G_n$ for $i$ in $\{1, 2, \ldots, N+t-1\}$ and $P_{N+t}$ belongs to
$G_{n+m}$. It will be convenient to denote the coordinates of each
$P_i$ by $P_i^j$, i.e. $P_i = (P_i^1 : P_i^2 : \cdots : P_i^n)$, where the $P_i^j$'s
belong to $G_1$.

Similarly, we will decompose the Pauli stream obtained by
applying the inverse encoder to $P$ as

$$(S_0 : L_1 : S_1 : \cdots : L_N : S_N : S_{N+1} : \cdots : S_{N+t}) \triangleq P V^{-1},$$

where $S_0$ belongs to $G_m$, the $L_i$'s all belong to $G_k$, the $S_i$'s
belong to $G_{n-k}$ for $i$ in $\{1, \ldots, N\}$, and the $S_{N+j}$'s belong
to $G_n$ for $j$ in $\{1, \ldots, t\}$.



As explained in Subsection IV-A, the $L_i$ and $S_i$ can be
obtained from the $P_i$ via the recursion relations of Eqs. (35)-(37),
which use auxiliary memory variables $M_i$. This recursive
procedure can be understood intuitively from the circuit diagram
of Fig. 6. It simply consists in propagating the effective Pauli
operator $P$ from the right-hand side to the left-hand side of the circuit.
This can be done in $N + t$ steps, each step passing through a
single seed transformation $U$, and the memory variables $M_i$
simply represent the operators acting on the memory qubits
between two consecutive seed transformations. The decoding
algorithm actually follows the same logic. As explained in
Sec. III-C, the probability on $L$ and $S$ is obtained from the
pullback of $P(P)$ through the encoder $V$ (cf. the corresponding
equation there). For a convolutional code, this pullback can be
decomposed into elementary steps, each step passing through a
single seed transformation $U$ and computing intermediate
probabilities on the memory variables.

Fig. 11. Information flow in the iterative turbo decoding procedure.

In addition to the procedure just outlined, the decoder 
must also update the probability P(P) obtained from the 
pullback of P(P) conditioned on the value of the observed 
syndrome. This operation is slightly more subtle, and requires 
not only the pullback of probabilities through the circuit, 
but also their push-forward (propagating from the left to the 
right-hand-side of the circuit). For that reason, the decoding 
algorithm presented at Algorithm 1 will consist of three steps, 
a backward pass (Algorithm 2), a forward pass (Algorithm 
3), and a local update (Algorithm 4). As indicated by their 
names, these respectively perform a pullback of probabilities, 
a push-forward of probabilities, and finally an operation that 
combines these two probabilities into the final result. 

Our description of these algorithms makes use of the following
notation:

$$S = (S_i)_{0 \le i \le N+t} \tag{45}$$
$$S_{\le i} = (S_j)_{0 \le j \le i} \tag{46}$$
$$S_{>i} = (S_j)_{N+t \ge j > i} \tag{47}$$

and we denote by $U_P$ the binary matrix formed by the $2n$ first
columns of $U$ and by $U_M$ the binary matrix formed by the
$2m$ last columns of $U$. This means that

$$P_i = (M_{i-1} : L_i : S_i)\, U_P$$
$$M_i = (M_{i-1} : L_i : S_i)\, U_M,$$

where the $M_i$ are defined from Equations (35), (36), and (37).
The notation $P(M_i) \propto \ldots$ means that the entries of the vector
$(P(M_i = \gamma))_{\gamma \in G_m}$ are proportional to the corresponding
right-hand side term, the proportionality factor being given
by normalization. Finally, for any integer $n$, we denote
$[n] \triangleq \{1, 2, \ldots, n\}$.
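The two relations defining $P_i$ and $M_i$ amount to one row-vector-by-matrix product over $\mathbb{F}_2$ followed by a split of the result into its first $2n$ and last $2m$ entries. The sketch below illustrates this with plain Python lists. The function names are hypothetical, each Pauli on $q$ qubits is assumed to be stored as a binary list of length $2q$ (the bit ordering is not fixed by this sketch), and the identity matrix is used as a placeholder seed transformation, not one of the seeds of this paper.

```python
def vec_mat_mod2(v, U):
    """Row vector v times matrix U over F_2 (U given as a list of rows)."""
    cols = len(U[0])
    return [sum(v[r] & U[r][c] for r in range(len(v))) % 2 for c in range(cols)]

def seed_step(M_prev, L, S, U, n):
    """One pass through the seed transformation.

    Returns (P_i, M_i), where P_i is given by the first 2n columns (U_P)
    and M_i by the remaining 2m columns (U_M) of the product.
    """
    out = vec_mat_mod2(M_prev + L + S, U)
    return out[:2 * n], out[2 * n:]

# Placeholder parameters: n = 2 physical, k = 1 logical, m = 1 memory qubit.
n, k, m = 2, 1, 1
size = 2 * (n + m)
U_placeholder = [[int(r == c) for c in range(size)] for r in range(size)]

M_prev = [1, 0]        # memory input, length 2m
L = [0, 1]             # logical input, length 2k
S = [1, 1]             # stabilizer input, length 2(n - k)
P_i, M_i = seed_step(M_prev, L, S, U_placeholder, n)
print(P_i, M_i)        # [1, 0, 0, 1] [1, 1]
```

Iterating `seed_step` with the returned memory as the next `M_prev` reproduces the forward recursion; the decoder runs the same bookkeeping on probability vectors rather than on individual Pauli operators.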

B. Turbo decoder 

A turbo-code is built from the interleaved serial concatenation
of two convolutional codes. The decoding of such a code
uses the SISO decoder of its constituent convolutional codes
in an iterative way that is schematically illustrated in Fig. 11.

The inner code is first decoded as described above but
without any information on the logical random variables:



Algorithm 1: The SISO algorithm for quantum convolutional codes
INPUTS:
    $P(P_i^j)$ for $i \in [N+t]$, $j \in [n]$ (and $j \in [n+m]$ when $i = N+t$)    From physical noise model
    $P(L_i^j)$ for $i \in [N]$, $j \in [k]$    From turbo decoder
    $S^x$    From syndrome measurement
OUTPUTS:
    $P(P_i^j|S^x)$ for $i \in [N+t]$, $j \in [n]$ (and $j \in [n+m]$ when $i = N+t$),
    $P(L_i^j|S^x)$ for $i \in [N]$, $j \in [k]$
ALGORITHM:
    backward pass
    forward pass
    local update



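Algorithm 2: Backward pass
INPUTS:
    Same as SISO algorithm
OUTPUTS:
    $P(M_i|S^x_{>i})$ for $i \in [N+t]$.
ALGORITHM:
    {Initialization: $P(M_{N+t})$ is given directly by the physical noise model.}
    for all $\gamma \in G_m$ do
        $P(M_{N+t} = \gamma) \leftarrow \prod_{j=1}^{m} P(P_{N+t}^{n+j} = \gamma^j)$
    end for
    {Recursion: first $t$ steps}
    for $i = N+t-1$ to $N+1$ do
        $P(M_i|S^x_{>i}) \propto \sum_{\sigma \in G_n : \sigma^x = S^x_{i+1}} P(P_{i+1} = (M_i : \sigma) U_P)\, P(M_{i+1} = (M_i : \sigma) U_M \,|\, S^x_{>i+1})$
    end for
    {Recursion: last $N$ steps}
    for $i = N$ to $1$ do
        $P(M_i|S^x_{>i}) \propto \sum_{\lambda \in G_k,\ \sigma \in G_{n-k} : \sigma^x = S^x_{i+1}} P(L_i = \lambda)\, P(P_{i+1} = (M_i : \lambda : \sigma) U_P)\, P(M_{i+1} = (M_i : \lambda : \sigma) U_M \,|\, S^x_{>i+1})$
    end for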



$P^{In}(L_i^j)$ is the uniform distribution. The distribution $P^{In}(P_i^j)$
is given directly by the channel model. The only output which
is used in the following step is the output distribution on the
logical variables given the syndrome measured on the inner
code: $P^{In}(L_i^j|S^x)$ ($S^x$ really refers to the part of the syndrome
measured for the inner code and not to the whole syndrome,
but we do not attach an "In" to it to avoid cumbersome
notation).

Then, the outer code is decoded with the SISO algorithm,
using as input distribution for the logical variables, as in the
previous case, the uniform distribution. The input distribution
of the physical variables $P^{Out}(P_i^j)$ is deduced from the logical
output distribution of the inner decoder:

$$P^{Out}(P_i^j = \gamma) = P^{In}(L_{i^\pi}^{j^\pi} K_i^j = \gamma \,|\, S^x),$$

where $i^\pi$ and $j^\pi$ are such that $(i^\pi, j^\pi) = \pi(i,j)$, and the $K_i^j$
are the single-qubit symplectic transformations that appear in
the quantum interleaver $\Pi$. This yields the output distributions
$P^{Out}(P_i^j|S^x)$ and $P^{Out}(L_i^j|S^x)$ (again, $S^x$ only refers to the
part of the syndrome attached to the outer code). This step is
terminated by estimating the most likely error coset $L$, setting

$$L_i^j = \mathrm{argmax}_\gamma \{P^{Out}(L_i^j = \gamma \,|\, S^x)\}.$$

To iterate this procedure, use the output probability
$P^{Out}(P_i^j|S^x)$ as information on the logical variables of the
inner code: in other words, set as input distribution for inner
SISO decoding

$$P^{In}(L_{i^\pi}^{j^\pi} = \gamma) = P^{Out}(P_i^j = \gamma K_i^j \,|\, S^x),$$

and the distribution of the physical variables is set by the
physical channel as before. This is represented by the feedback
loop in Fig. 11, where information from the outer decoder is
returned to the inner decoder.

Algorithm 3: Forward pass
INPUTS:
    Same as SISO algorithm
OUTPUTS:
    $P(M_i|S^x_{\le i})$ for $i \in \{0, \ldots, N+t-1\}$.
ALGORITHM:
    {Initialization: }
    for all $\gamma \in G_m$ do
        if $\gamma^x = S_0^x$ then
            $P(M_0 = \gamma|S^x_{\le 0}) \leftarrow 2^{-m}$
        else
            $P(M_0 = \gamma|S^x_{\le 0}) \leftarrow 0$
        end if
    end for
    {Recursion: }
    for $i = 1$ to $N+t-1$ do
        $P(M_i|S^x_{\le i}) \propto \sum_{\mu \in G_m,\ \lambda \in G_k,\ \sigma \in G_{n-k} : \sigma^x = S_i^x,\ M_i = (\mu : \lambda : \sigma) U_M} P(L_i = \lambda)\, P(P_i = (\mu : \lambda : \sigma) U_P)\, P(M_{i-1} = \mu \,|\, S^x_{\le i-1})$
    end for

Algorithm 4: Local update
INPUTS:
    Same as SISO algorithm
    $P(M_i|S^x_{>i})$ for $i \in [N+t]$    From backward pass
    $P(M_i|S^x_{\le i})$ for $i \in \{0, \ldots, N+t-1\}$    From forward pass
OUTPUTS:
    $P(P_i^j|S^x)$ for $i \in [N+t]$, $j \in [n]$ (and $j \in [n+m]$ for $i = N+t$)
    $P(L_i^j|S^x)$ for $i \in [N]$ and $j \in [k]$
ALGORITHM:
    for $i = 1$ to $N+t$ do
        $P(P_i|S^x) \propto \sum_{\mu \in G_m,\ \lambda \in G_k,\ \sigma \in G_{n-k} : \sigma^x = S_i^x,\ P_i = (\mu : \lambda : \sigma) U_P} P(P_i)\, P(L_i = \lambda)\, P(M_{i-1} = \mu \,|\, S^x_{\le i-1})\, P(M_i = (\mu : \lambda : \sigma) U_M \,|\, S^x_{>i})$
    end for
    {Marginalization: }
    Compute $P(L_i^j|S^x)$ from $P(L_i|S^x)$
    Compute $P(P_i^j|S^x)$ from $P(P_i|S^x)$

This procedure can be repeated an arbitrary number
of times, with each iteration yielding an estimate of the
maximum-likelihood decoder of the outer code. The iterations
can be halted after a fixed number of rounds, or when the
estimate does not vary from one iteration to the next. Although
the decoding scheme is exact for both constituent codes, the
overall turbo-decoding is sub-optimal. The reason for this is
that although $P^{In}(P)$ is memoryless, the induced channel
$P^{Out}(P)$ on the outer code, obtained from $P^{Out}(P_i^j) =
P^{In}(L_{i^\pi}^{j^\pi} K_i^j|S^x)$, is not. The decoder ignores this fact and only
uses the marginals $P^{Out}(P_i^j)$ of $P^{Out}(P)$. This is the price
to pay for an efficient decoding algorithm.

VI. Results 

The convolutional codes we used for our construction of
turbo-codes are for the most part generated at random. That
is, we first generate a random seed transformation $U$ of desired
dimensions. Using its state diagram, we then test whether the
corresponding encoder is catastrophic, and if so we reject it
and start over. Non-catastrophicity is the only criterion that
we systematically imposed.

As a first sieve among the randomly generated non-catastrophic
seed transformations, we can study their distance
spectra and make some heuristic tests based on them. Examples
of good seed transformations obtained from this procedure are

$U_{(3,1,3)}$ = {2085, 926, 2053, 1434, 910, 3943, 1484, 2881, 3212,
2250, 68, 331},

$U_{(3,1,4)}$ = {13159, 10335, 13127, 6554, 10319, 14441, 10625,
5835, 832, 13893, 11916, 11329, 8204, 5570},

$U_{(2,1,4)}$ = {610, 3323, 760, 1591, 2500, 942, 2290, 794, 1535,
2202, 2859, 809},

where the binary symplectic encoding matrix is specified
by its list of rows and each row is given by the integer
corresponding to the binary entry. The subscript on each
encoder specifies its parameters $(n,k,m)$. Hence, the first
two codes have rate 1/3 but differ by the size of their memory.
The third code has a higher rate of 1/2. The first few values
of the distance spectrum of logical-weight-one codewords for
these quantum convolutional codes are given in Table I, while
the distance spectra of all codewords are listed in Table II.
Based on those values, we conclude that the turbo-code
obtained from the concatenation of the code with seed transformation
$U_{(3,1,3)}$ with itself has a minimal distance no greater than
6 × 4 = 24. Similarly, the code obtained by the concatenation
of $U_{(3,1,4)}$ with itself has minimal distance no greater than
7 × 6 = 42, and the one obtained from the concatenation of
$U_{(2,1,4)}$ with itself has minimal distance no greater than
8 × 6 = 48.

TABLE I. Distance spectrum $F_1(w)$ of logical-weight-one codewords.

These are upper bounds on the minimal distance and do
not translate directly into the performance of the code. On the
one hand, the actual minimal distance of a turbo-code depends
on the interleaver, which we chose completely at random. In
all cases, there are most likely lower weight codewords than
the estimate provided by these upper bounds, but those are
atypical. On the other hand, the codes are not decoded with
a minimum distance decoder, so even a true large minimal
distance does not imply low WER.

The WER of a quantum turbo-code on a depolarization
channel can be estimated using Monte Carlo methods. An
error $P \in G_n$ is generated randomly according to the channel
model probability distribution. The syndrome associated to
this error is evaluated, and based on its value, the decoding
algorithm (see Sec. V) is executed. The decoding algorithm
outputs an error estimate $P'$. If $P - P' \in C(I)$, the decoding
is accepted, otherwise it is rejected. In other words, the
decoding is accepted only if all $K$ encoded qubits are correctly
recovered. The WER is then the fraction of rejected decodings.

The WERs as a function of the depolarizing probability $p$
are shown for a selection of codes in Figs. 12-14. Perhaps the
most striking feature of those curves is the existence of a
pseudo-threshold value of $p$ below which the WER decreases
as the number of encoded qubits is increased. Since the codes
have a bounded minimal distance, this is not a true threshold in
the sense that as we keep increasing the number of encoded
qubits, the WER should start to increase. However, we see
that for modest sizes $K$ of up to 4000, this effect is not
observed. We do see however that the improvement appears
to be saturating around these values. The pseudo-threshold is
particularly clear for the seed transformation $U_{(3,1,3)}$, where
it is approximately 0.098, and for the seed transformation
$U_{(2,1,4)}$, where it is approximately 0.067. Its value for the seed
transformation $U_{(3,1,4)}$ is not as clear, but seems to be between
0.095 and 0.11.

These values should be compared with the hashing bound,
whose value is approximately 0.16024 for a rate 1/9 code and
0.12689 for rate 1/4. We can also compare with the results
obtained from LDPC codes in [28, Figure 10] by evaluating
the depolarizing probability $p$ at which the WER drops below
$10^{[\ldots]}$. For a rate 1/4, this threshold was achieved at
$p_{th} \approx 0.033$ (note the convention $f_m = \tfrac{2}{3}p$) for LDPC codes, while the
turbo-code shown in Fig. 14 has $p_{th} \approx 0.048$. It should also be
noted that this improved threshold is achieved with a smaller
block size than that used for the LDPC code in [28]; a larger block
should further improve this result.

Fig. 12. WER vs depolarizing probability $p$ for the quantum turbo-code
obtained from the concatenation of the convolutional code with seed
transformation $U_{(3,1,3)}$ with itself, for different numbers of encoded qubits
$K$. Each constituent convolutional code has $m = 3$ qubits of memory and
has rate 1/3, so the rate of the turbo-code is 1/9.

Fig. 13. WER vs depolarizing probability $p$ for the quantum turbo-code
obtained from the concatenation of the convolutional code with seed
transformation $U_{(3,1,4)}$ with itself, for different numbers of encoded qubits
$K$. Each constituent convolutional code has $m = 4$ qubits of memory and
has rate 1/3, so the rate of the turbo-code is 1/9.

Fig. 14. WER vs depolarizing probability $p$ for the quantum turbo-code
obtained from the concatenation of the convolutional code with seed
transformation $U_{(2,1,4)}$ with itself, for different numbers of encoded qubits
$K$. Each constituent convolutional code has $m = 4$ qubits of memory and
has rate 1/2, so the rate of the turbo-code is 1/4.

As expected, changing the rate of the code directly affects
the value of the pseudo-threshold. This is seen by comparing
either of Figs. 12 or 13 to Fig. 14. The effect of the memory
size is however less obvious. Comparing Figs. 12 and 13, it
appears that the effect of a larger memory is to sharpen the
slope of the WER profile below the pseudo-threshold for fixed
$K$. In other words, the main impact of the memory size is
not in the value of the pseudo-threshold, but rather in the
effectiveness of the error suppression below that threshold.
This conclusion is somewhat supported by Fig. 15, where the
WER is plotted for a variety of memory configurations. In all
cases, the slope of the WER increases with the memory size.

Fig. 15. WER vs depolarizing probability $p$ for a quantum turbo-code
encoding $K = 100$ qubits and rate [...] with different memory configurations
$(m^{In}, m^{Out})$.
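The integer encoding of the seed transformations listed above can be unpacked as follows. This is a sketch under stated assumptions: the bit order (most significant bit first), the two candidate symplectic forms, and all function names are ours and are not specified in the text, so the symplecticity check may report False for the listed seed simply because a different convention was used by the authors. The rows are those of $U_{(2,1,4)}$, which has $2(n+m) = 12$ rows of 12 bits.

```python
def rows_to_matrix(rows, width):
    """Expand each integer into `width` bits, most significant bit first (assumed)."""
    return [[(r >> (width - 1 - c)) & 1 for c in range(width)] for r in rows]

def block_form(q):
    """Symplectic form for the (x_1..x_q | z_1..z_q) ordering (assumed)."""
    return [[int(j == (i + q) % (2 * q)) for j in range(2 * q)] for i in range(2 * q)]

def interleaved_form(q):
    """Symplectic form for the (x_1 z_1 x_2 z_2 ...) ordering (assumed)."""
    return [[int(j == (i ^ 1)) for j in range(2 * q)] for i in range(2 * q)]

def is_symplectic(U, form):
    """Check U * form * U^T == form over F_2."""
    size = len(U)

    def mat_mul(A, B):
        return [[sum(A[i][t] & B[t][j] for t in range(size)) % 2
                 for j in range(size)] for i in range(size)]

    U_transposed = [list(col) for col in zip(*U)]
    return mat_mul(mat_mul(U, form), U_transposed) == form

# Rows of the seed transformation with parameters (n, k, m) = (2, 1, 4).
U_214_rows = [610, 3323, 760, 1591, 2500, 942, 2290, 794, 1535, 2202, 2859, 809]
U = rows_to_matrix(U_214_rows, 12)
for row in U:
    print("".join(map(str, row)))
print("block convention:", is_symplectic(U, block_form(6)))
print("interleaved convention:", is_symplectic(U, interleaved_form(6)))
```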

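The Monte Carlo procedure used to estimate the WER can be sketched as follows. The turbo decoder itself is not reproduced here: `decoder_succeeds` is a hypothetical callback standing for "run the decoder and test whether the estimate lies in the correct coset", and the example plugs in a trivial placeholder for unencoded qubits. The depolarizing convention (identity with probability $1-p$, each of X, Y, Z with probability $p/3$) is assumed.

```python
import random

def sample_depolarizing(n, p, rng):
    """Draw an n-qubit Pauli error: I w.p. 1-p, X/Y/Z w.p. p/3 each (assumed)."""
    return [rng.choice("XYZ") if p > rng.random() else "I" for _ in range(n)]

def estimate_wer(n, p, trials, decoder_succeeds, rng):
    """Fraction of sampled errors for which decoding is rejected."""
    failures = 0
    for _ in range(trials):
        error = sample_depolarizing(n, p, rng)
        if not decoder_succeeds(error):
            failures += 1
    return failures / trials

def no_code_succeeds(error):
    """Placeholder 'decoder' for unencoded qubits: success iff no error occurred."""
    return all(pauli == "I" for pauli in error)

rng = random.Random(1)  # fixed seed, for reproducibility only
print(estimate_wer(n=5, p=0.05, trials=20000,
                   decoder_succeeds=no_code_succeeds, rng=rng))
```

For this placeholder the exact answer is $1 - (1-p)^n$, which gives a simple sanity check of the sampling loop before a real decoder is substituted.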

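The hashing-bound values quoted above can be recovered numerically. For the depolarizing channel with the convention assumed here (each of X, Y, Z occurring with probability $p/3$), the hashing bound on the rate is $1 - H_2(p) - p\log_2 3$, and the quoted thresholds are the values of $p$ at which this expression equals the code rate. The function names and the bisection bracket are choices made for this sketch.

```python
import math

def hashing_rate(p):
    """Hashing-bound rate 1 - H2(p) - p*log2(3), for p strictly between 0 and 1."""
    h2 = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return 1.0 - h2 - p * math.log2(3)

def hashing_threshold(rate, lo=1e-9, hi=0.1892, iters=100):
    """Bisection for the p at which hashing_rate(p) equals `rate`.

    hashing_rate is decreasing on the bracket [lo, hi], which covers
    all rates between roughly 0 and 1.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if hashing_rate(mid) > rate:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for rate in (1 / 9, 1 / 4):
    print(rate, hashing_threshold(rate))
```

The two printed thresholds agree, to within rounding, with the values 0.16024 and 0.12689 given in the text for rates 1/9 and 1/4.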

VII. Conclusion 

In this article, we have presented a detailed theory of
quantum serial turbo-codes based on the interleaved serial
concatenation of quantum convolutional codes. The description
and analysis of these codes was greatly simplified by the
use of a circuit representation of the encoder. In particular,
this representation provides a simple definition of the state
diagram associated to a quantum convolutional code, and
enables a simple and intuitive derivation of their efficient
decoding algorithm.

By a detailed analysis of the state diagram, we have shown 
that all recursive convolutional encoders have catastrophic 
error propagation. Recursive convolutional encoders can be
constructed and yield serial turbo-codes with polynomial min-
imal distances. However, they offer extremely poor iterative
decoding performances due to their unavoidable catastrophic 
error propagation. The encoders we have used in our con- 
structions are thus chosen to be non-catastrophic and non- 
recursive. While the resulting codes have bounded minimal 
distance, we have found that they offer good iterative decoding 
performances over a range of block sizes and word error rates 
that are of practical interest. 

Compared to quantum LDPC codes, quantum turbo-codes 
offer several advantages. On the one hand, there is complete 
freedom in the code design in terms of length, rate, memory 
size, and interleaver choice. The freedom in the interleaver 
is crucial since it is the source of the randomness that is 
responsible for the success of these codes. On the other 
hand, the graphical representation of turbo-codes is free of 4- 
cycles that deteriorate the performances of iterative decoding. 
Finally, the iterative decoder makes explicit use of the code's 
degeneracy. This feature is important because turbo-codes, like 
LDPC codes, have low-weight stabilizers and are hence greatly 
degenerate. 

In future work, we hope to surmount the obstacle of catas- 
trophic error propagation. A concrete avenue is the generalized 
stabilizer formalism of operator quantum error correction 
[34], which could circumvent the conclusions of our theorem 
established in the context of subspace stabilizer codes. Doping 
[45] is another possibility that we will investigate.
Appendix

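To prove Lemma 2, we first establish some simple facts.

Fact 4: The subspace of $\mathbb{F}_2^{2n+2m}$ orthogonal to all the rows
of $U_{\mathrm{eff}}$ is the space spanned by the rows of its submatrix
$[\Sigma_P : \Sigma_M]$. Similarly, the subspace of $\mathbb{F}_2^{2n+2m}$
orthogonal to all the rows of

$$\begin{bmatrix} \mu_P & \mu_M \\ \Sigma_P & \Sigma_M \end{bmatrix}$$

is the space spanned by the rows of

$$\begin{bmatrix} \Lambda_P & \Lambda_M \\ \Sigma_P & \Sigma_M \end{bmatrix}.$$
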



Proof: The subspace $V$ of $\mathbb{F}_2^{2n+2m}$ orthogonal to all the
rows of $U_{\mathrm{eff}}$ is of dimension
$2n + 2m - (2m + 2k + n - k) = n - k$. We observe now that the rows of
$[\Sigma_P : \Sigma_M]$ are all independent and all orthogonal to the rows
of $U_{\mathrm{eff}}$. They therefore form a basis of $V$. This finishes
the proof of the first statement. The second one is obtained by similar
arguments. ■

Proof: (of Lemma 2) Let $M' \in \mathbb{F}_2^{2m}$ be such that there
exist $M \in \mathbb{F}_2^{2m}$, $S \in \mathbb{F}_2^{n-k}$, and
$L \in \mathbb{F}_2^{2k}$ such that
$(M : L : S)U_{\mathrm{eff}} = (0_{2n} : M')$. Notice now that
$(0_{2n} : M')$ is spanned by the rows of $U_{\mathrm{eff}}$ and is
therefore orthogonal to all the rows of the matrix
$[\Sigma_P : \Sigma_M]$. This implies that $M'$ is orthogonal to all the
rows of $\Sigma_M$. Conversely, any row vector of the form
$(0_{2n} : M')$ with $M'$ orthogonal to all the rows of $\Sigma_M$ is
orthogonal to all the rows of $[\Sigma_P : \Sigma_M]$ and is therefore
spanned by the rows of $U_{\mathrm{eff}}$. This implies that there exist
$M \in \mathbb{F}_2^{2m}$, $S \in \mathbb{F}_2^{n-k}$, and
$L \in \mathbb{F}_2^{2k}$ such that
$(M : L : S)U_{\mathrm{eff}} = (0_{2n} : M')$. Furthermore, since the
rows of $U_{\mathrm{eff}}$ are independent, if such an $(M : L : S)$
exists, it is unique. ■
The proof of Lemma 3 requires a straightforward fact and a lemma.

Fact 5: For any $M, M' \in \mathbb{F}_2^{2m}$ we have

$$(M\mu_P : M\mu_M) \star (M'\mu_P : M'\mu_M) = M \star M'.$$

Proof: This is a straightforward consequence of the orthogonality
relations satisfied by the first $2m$ rows of $U$. ■
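
Fact 5 can be checked numerically on a small instance. The sketch below is an illustration under stated assumptions, not the construction used in the paper. It assumes binary Pauli vectors ordered qubit by qubit as (x1, z1, x2, z2, ...), so that the first 2m coordinates describe the m memory qubits. A random symplectic matrix stands in for the seed transformation U, and it is built from transvections purely for convenience. All function names are hypothetical.

```python
import itertools
import numpy as np

def symplectic_form(num_qubits):
    """Matrix J of the star product for coordinates ordered (x1, z1, x2, z2, ...)."""
    J = np.zeros((2 * num_qubits, 2 * num_qubits), dtype=int)
    for q in range(num_qubits):
        J[2 * q, 2 * q + 1] = 1
        J[2 * q + 1, 2 * q] = 1
    return J

def star(a, b):
    """Symplectic product over GF(2): 0 if the two Paulis commute, 1 otherwise."""
    J = symplectic_form(len(a) // 2)
    return int((a @ J @ b) % 2)

def random_symplectic(num_qubits, rng, num_transvections=30):
    """Random symplectic matrix, built as a product of transvections x -> x + (x*v) v."""
    J = symplectic_form(num_qubits)
    U = np.eye(2 * num_qubits, dtype=int)
    for _ in range(num_transvections):
        v = rng.integers(0, 2, size=2 * num_qubits)
        coeffs = (U @ J @ v) % 2           # star product of every row of U with v
        U = (U + np.outer(coeffs, v)) % 2  # apply the transvection to every row
    return U

def fact5_holds(mu, m):
    """Check (M mu) * (M' mu) == M * M' for all pairs of memory vectors M, M'."""
    vectors = [np.array(bits) for bits in itertools.product((0, 1), repeat=2 * m)]
    return all(
        star((M @ mu) % 2, (Mp @ mu) % 2) == star(M, Mp)
        for M in vectors
        for Mp in vectors
    )

n, m = 3, 2                      # small illustrative parameters
rng = np.random.default_rng(0)
U = random_symplectic(n + m, rng)
mu = U[: 2 * m, :]               # first 2m rows, playing the role of [mu_P : mu_M]
print(fact5_holds(mu, m))
```

The check succeeds for any symplectic U because the rows of U satisfy the same commutation relations as the standard basis vectors they are images of.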
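Lemma 8: Let $T \in \mathbb{F}_2^{2m}$ and let $M'$ be such that there
exist $M \in \mathbb{F}_2^{2m}$, $S \in \mathbb{F}_2^{n-k}$, and
$L \in \mathbb{F}_2^{2k}$ such that
$(M : L : S)U_{\mathrm{eff}} = (0_{2n} : M')$. We have

$$M' \star T\mu_M = M \star T.$$

Proof: We observe that

$$\begin{aligned}
M' \star T\mu_M &= (0_{2n} : M') \star (T\mu_P : T\mu_M) \\
&= (M\mu_P + L\Lambda_P + S\Sigma_P : M\mu_M + L\Lambda_M + S\Sigma_M) \star (T\mu_P : T\mu_M) \\
&= (M\mu_P : M\mu_M) \star (T\mu_P : T\mu_M), \qquad (48)
\end{aligned}$$

where the last equation follows from the fact that any row
of $[\Lambda_P : \Lambda_M]$ or $[\Sigma_P : \Sigma_M]$ is orthogonal to
all the rows of $[\mu_P : \mu_M]$. From this and Fact 5, we conclude that

$$M' \star T\mu_M = M \star T.$$

■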

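Proof: (of Lemma 3) Since $M' \in \mathcal{S}_0^\perp$, there exist by
Lemma 2 $M \in \mathbb{F}_2^{2m}$, $S \in \mathbb{F}_2^{n-k}$, and
$L \in \mathbb{F}_2^{2k}$ such that
$(M : L : S)U_{\mathrm{eff}} = (0_{2n} : M')$. Let $T \in \mathcal{S}_0$.
Using Lemma 8, we obtain

$$M' \star T\mu_M = M \star T.$$

Notice now that $M' \star T\mu_M = 0$ since $T\mu_M \in \mathcal{S}_0$.
From this, $M \star T = 0$. This shows that $M$ belongs to
$\mathcal{S}_0^\perp$ too. The uniqueness of $M$ is a consequence of
Lemma 2. ■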
The following lemma is used in the proof of Lemma 4.

Lemma 9: Let $\mu$ be a linear mapping from $\mathbb{F}_2^{2m}$ to itself.
Let $V$ be a subspace of $\mathbb{F}_2^{2m}$ such that $\mu(V) \subset V$
and which contains the null space of any positive power of $\mu$. Then for
any $M$ in $V^\perp$ there exists $M'$ in $V^\perp$ such that for any $T$
in $\mathbb{F}_2^{2m}$:

$$M \star T = M' \star T\mu.$$

Proof: We are first going to prove this statement in the case

$$V = \bigcup_{t=1}^{\infty} \mathrm{Null}(\mu^t).$$

This is a subspace of $\mathbb{F}_2^{2m}$ since the
$\mathrm{Null}(\mu^t)$'s are nested sets:

$$\mathrm{Null}(\mu) \subset \mathrm{Null}(\mu^2) \subset \cdots \subset \mathrm{Null}(\mu^t) \subset \cdots.$$

Let us consider the space $\mathrm{Im}(\mu^t)$ generated by the rows of
$\mu^t$. Since
$\mathbb{F}_2^{2m} \supset \mathrm{Im}(\mu) \supset \mathrm{Im}(\mu^2) \supset \cdots$,
there must exist a positive $t$ such that
$\mathrm{Im}(\mu^t) = \mathrm{Im}(\mu^{t+1})$. In this case,
$\mu(\mathrm{Im}(\mu^t)) = \mathrm{Im}(\mu^t)$. This implies that the
restriction of $\mu$ to $\mathrm{Im}(\mu^t)$ is a one-to-one mapping and
that $\mathrm{Null}(\mu^t) \cap \mathrm{Im}(\mu^t) = \{0_{2m}\}$.
Since $\dim(\mathrm{Null}(\mu^t)) + \dim(\mathrm{Im}(\mu^t)) = 2m$, we can
form a basis $(T_1, \ldots, T_l, T_{l+1}, \ldots, T_{2m})$ of
$\mathbb{F}_2^{2m}$ such that $(T_1, \ldots, T_l)$ spans
$\mathrm{Null}(\mu^t)$ and $(T_{l+1}, \ldots, T_{2m})$ spans
$\mathrm{Im}(\mu^t)$. Moreover, all the $\mathrm{Null}(\mu^v)$'s are equal
for $v$ greater than or equal to $t$. This follows directly from the fact
that the $\mathrm{Im}(\mu^v)$'s are all equal in this case. This can be
checked by using the relations
$\dim(\mathrm{Null}(\mu^v)) + \dim(\mathrm{Im}(\mu^v)) = 2m$. From these
equalities, we deduce that
$\dim(\mathrm{Null}(\mu^t)) = \dim(\mathrm{Null}(\mu^{t+1})) = \cdots = \dim(\mathrm{Null}(\mu^v)) = \cdots$.
The $\mathrm{Null}(\mu^v)$'s are nested sets and therefore
$\mathrm{Null}(\mu^t) = \mathrm{Null}(\mu^{t+1}) = \cdots = \mathrm{Null}(\mu^v) = \cdots$.
This implies that $V = \mathrm{Null}(\mu^t)$. We define $U_i$ for $i$ in
$\{l+1, \ldots, 2m\}$ as the unique element of $\mathrm{Im}(\mu^t)$ such
that $U_i\mu = T_i$. There exists a unique $M'$ such that

$$M' \star T_i = 0 \quad \text{for } i \in \{1, \ldots, l\} \qquad (49)$$
$$M' \star T_i = M \star U_i \quad \text{for } i \in \{l+1, \ldots, 2m\} \qquad (50)$$

This $M'$ belongs to $V^\perp$ by Equation (49). Note now that we have
defined $M'$ in such a way that $M \star T$ coincides with
$M' \star T\mu$ over the basis
$(T_1, \ldots, T_l, U_{l+1}, \ldots, U_{2m})$. Therefore, by linearity of
the $\star$ product, we have $M \star T = M' \star T\mu$ for all $T$ in
$\mathbb{F}_2^{2m}$.

The general case is a direct consequence of this particular case. We
define $M'$ similarly by Equations (49) and (50), and
it is readily checked that $M'$ belongs to $V^\perp$. ■
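
The stabilization of the chains of null spaces and images used in the proof of Lemma 9 can be observed on a toy example. The sketch below is only an illustration: the 4 x 4 matrix is a hypothetical choice, not taken from any encoder in the paper, and the spaces are enumerated by brute force, which is feasible only in very small dimension. Vectors are rows and $\mu$ acts on the right, as in the text.

```python
import itertools
import numpy as np

def all_vectors(d):
    """All row vectors of GF(2)^d."""
    return [np.array(bits) for bits in itertools.product((0, 1), repeat=d)]

def null_space(mat):
    """Set of row vectors T with T @ mat = 0 over GF(2)."""
    d = mat.shape[0]
    return {tuple(int(x) for x in T)
            for T in all_vectors(d) if not ((T @ mat) % 2).any()}

def image(mat):
    """Set of row vectors of the form T @ mat over GF(2), i.e. the row space of mat."""
    d = mat.shape[0]
    return {tuple(int(x) for x in (T @ mat) % 2) for T in all_vectors(d)}

def stabilisation(mu):
    """Smallest t >= 1 with Im(mu^t) = Im(mu^(t+1)), returned together with mu^t."""
    d = mu.shape[0]
    power = mu % 2
    for t in range(1, d + 1):
        nxt = (power @ mu) % 2
        # The images are nested, so equal sizes imply equal sets.
        if len(image(nxt)) == len(image(power)):
            return t, power
        power = nxt
    raise AssertionError("the chain of images must stabilise within d steps")

# Hypothetical example: a nilpotent 2x2 block next to an invertible 2x2 block.
mu = np.array([[0, 1, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 1]])

t, mu_t = stabilisation(mu)
null_t, im_t = null_space(mu_t), image(mu_t)
print(t, len(null_t), len(im_t), null_t & im_t)
```

In this example the null space and the image of $\mu^t$ intersect only in the zero vector and their sizes multiply to $2^4$, which is the direct-sum decomposition the proof relies on when it forms the basis $(T_1, \ldots, T_{2m})$.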
Proof: (of Lemma 4) We know from Lemma 3 that for
any element $M'$ in $\mathcal{S}_0^\perp$, there exists a unique $M$ in
$\mathcal{S}_0^\perp$ such that there is an edge of zero physical-weight
in the state diagram which goes from $M$ to $M'$. To prove that the kernel
graph has constant in-degree 1 we just have to prove that when $M'$
belongs to the subset $\mathcal{V}_0^\perp$ of $\mathcal{S}_0^\perp$ the
corresponding $M$ also belongs to this subset. Since for any
$T \in \mathcal{V}_0$ we have $M' \star T = 0$ and since $\mathcal{V}_0$
is stable under the application of $\mu_M$, we obtain for such a $T$,
$M \star T = M' \star T\mu_M = 0$. This shows that $M$ also belongs to
$\mathcal{V}_0^\perp$.

On the other hand, by applying Lemma 9 with $V = \mathcal{V}_0$, we
know that for any vertex $M$ of the kernel graph, there is an
$M'$ belonging also to $\mathcal{V}_0^\perp$ such that for any $T$ in
$\mathbb{F}_2^{2m}$:

$$M \star T = M' \star T\mu_M.$$

Note that given such an $M'$ there is a unique $M$ which
satisfies the aforementioned equality for all $T$. Therefore $M$
is necessarily the starting vertex of the unique directed edge
of zero physical-weight having as endpoint $M'$. ■
Proof: (of Lemma 5) We just have to prove that the set
$\mathcal{V}_0$ is not equal to the whole space $\mathbb{F}_2^{2m}$. We
proceed by contradiction. Assume that
$\mathcal{V}_0 = \mathbb{F}_2^{2m}$. Notice now that there exists a finite
number $t$ such that

$$\mathcal{V}_0 = \mathrm{Null}(\mu_M^t) + \sum_{i=0}^{t} \mathcal{S}\mu_M^i.$$

For such a $t$, any $M$ in $\mathcal{V}_0$ can be expressed as a sum
$M = N + \sum_{i=0}^{t} T_i\mu_M^i$ where $N$ is in
$\mathrm{Null}(\mu_M^t)$ and the $T_i$'s all belong to $\mathcal{S}$, i.e.
they are of the form $T_i = S_i\Sigma_M$ for some
$S_i \in \mathbb{F}_2^{n-k}$. Consider now a finite path starting at the
origin with logical weight 1 and non-zero physical-weight. We denote
by $M$ its endpoint (which is viewed as an element in
$\mathbb{F}_2^{2m}$). We decompose $M\mu_M^{t+1}$ as explained before:

$$M\mu_M^{t+1} = N + \sum_{i=0}^{t} S_{t-i}\Sigma_M\mu_M^i,$$

where the $S_i$'s belong to $\mathbb{F}_2^{n-k}$. The path of length
$t+1$ which starts at $M$ and which corresponds to the sequence of pairs
of logical transformations/stabilizer transformations
$(0_{2k} : S_0) \to (0_{2k} : S_1) \to \cdots \to (0_{2k} : S_t)$ will go
from point $M$ to $N$.

By extending this path by feeding in $t$ zero transformations
$(0_{2k} : 0_{n-k})$ we go from vertex $N$ to $N\mu_M^t$, which is equal
to $0_{2m}$ by definition. This path may then continue by feeding
in additional zero transformations and will stay at the zero
vertex forever. This contradicts the fact that the quantum code
is recursive. ■
Proof: (of Lemma 6) Let $M'$ be an element of $\mathbb{F}_2^{2m}$ for
which there exist $M \in \mathbb{F}_2^{2m}$ and
$S \in \mathbb{F}_2^{n-k}$ such that
$(M : 0_{2k} : S)U_{\mathrm{eff}} = (0_{2n} : M')$. Then
$(0_{2n} : M')$ is spanned by the rows of $[\mu_P : \mu_M]$ and
$[\Sigma_P : \Sigma_M]$. By Fact 4, this implies that $(0_{2n} : M')$ is
orthogonal to all the rows of the matrices $[\Lambda_P : \Lambda_M]$ and
$[\Sigma_P : \Sigma_M]$. Hence $M'$ should belong to
$\mathcal{L}^\perp$. On the other hand, any $(0_{2n} : M')$ for which
$M'$ belongs to $\mathcal{L}^\perp$ is orthogonal to all the rows of
$[\Lambda_P : \Lambda_M]$ and $[\Sigma_P : \Sigma_M]$ and is therefore
spanned by the rows of $[\mu_P : \mu_M]$ and $[\Sigma_P : \Sigma_M]$. ■

Proof: (of Lemma 7) This amounts to proving that there
exists a vertex in the kernel graph which does not belong to
$\mathcal{L}^\perp$. The set of vertices of the kernel graph is
$\mathcal{V}_0^\perp$. Therefore, we need to find an element of
$\mathcal{L}$ that is not in $\mathcal{V}_0$. In particular, we would be
done if there existed a row of $\Lambda_M$ which does not belong to
$\mathcal{V}_0$.

Assume the opposite. Let $t$ be the integer such that
$\mathcal{V}_0 = \mathrm{Null}(\mu_M^t) + \sum_{i=0}^{t} \mathcal{S}\mu_M^i$.
Then, for every $L$ in $\mathbb{F}_2^{2k}$ of weight 1 and any integer
$r \geq 0$, there exist $S_0, S_1, \ldots, S_t$ in $\mathbb{F}_2^{n-k}$
and an $N$ in $\mathrm{Null}(\mu_M^t)$ such that

$$L\Lambda_M\mu_M^{r} = N + \sum_{i=0}^{t} S_{t-i}\Sigma_M\mu_M^i.$$

Consider a finite path of non-zero physical-weight and logical
weight 1 starting at the origin and ending at a vertex $M$.
Assume that this path corresponds to the sequence of pairs
of logical/stabilizer inputs

$$(0_{2k} : S_0) \to (0_{2k} : S_1) \to \cdots \to (0_{2k} : S_{i-1}) \to (L : S_i) \to (0_{2k} : S_{i+1}) \to \cdots \to (0_{2k} : S_u),$$

(i.e. the only time where the logical transformation is non-zero
is at time $i$ and is equal to $L$ which is assumed to be of weight
1). The final memory state would then be

$$M = L\Lambda_M\mu_M^{u-i} + \sum_{j=0}^{u} S_{u-j}\Sigma_M\mu_M^{j}. \qquad (51)$$

Since, by assumption, the rows of $\Lambda_M$ are in $\mathcal{V}_0$,
there exist $S'_0, \ldots, S'_t$ in $\mathbb{F}_2^{n-k}$ and $N'$ in
$\mathrm{Null}(\mu_M^t)$ such that

$$L\Lambda_M\mu_M^{u-i+t+1} + \sum_{j=0}^{u} S_{u-j}\Sigma_M\mu_M^{j+t+1} = N' + \sum_{j=0}^{t} S'_{t-j}\Sigma_M\mu_M^{j}. \qquad (52)$$

Thus, if we extend the path by the sequence of inputs

$$(0_{2k} : S'_0) \to (0_{2k} : S'_1) \to \cdots \to (0_{2k} : S'_t),$$

we arrive at the vertex $M'$ which satisfies

$$\begin{aligned}
M' &= M\mu_M^{t+1} + \sum_{j=0}^{t} S'_{t-j}\Sigma_M\mu_M^{j} \\
&= L\Lambda_M\mu_M^{u-i+t+1} + \sum_{j=0}^{u} S_{u-j}\Sigma_M\mu_M^{j+t+1} + \sum_{j=0}^{t} S'_{t-j}\Sigma_M\mu_M^{j} \\
&= N'.
\end{aligned}$$

Extending this whole sequence by adding $t$ zero transformations
$(0_{2k} : 0_{n-k})$ will bring this path back to the origin
since $N'$ is in the kernel of $\mu_M^t$. Once at the origin, the
encoder can remain in that state forever without any additional
physical output. This implies that the code is non recursive,
and completes the proof. ■